ABSTRACT

The National Education Longitudinal Study of 1988 
(NELS:88) is designed to monitor the transition of a national sample 
of young adults as they progress from junior to senior high school 
and then to postsecondary education or the world of work. An in-depth 
description is provided of the rationale, development, and 
psychometric properties of the base year test for grade 8. The 
achievement test battery was composed of four tests: (1) reading 
comprehension; (2) mathematics; (3) science; and (4) 
history/citizenship/geography. The eighth grade (base year) sample 
was composed of approximately 24,600 eighth graders from 1,052 
schools. Results show that the NELS:88 test battery met or exceeded 
all its psychometric objectives. Reliabilities for the reading 
comprehension, mathematics, and history/citizenship/geography tests 
were acceptable; the science test was somewhat less reliable. 
Internal consistency was high enough to justify item response theory 
scoring. There was no consistent evidence of item bias for gender or 
racial/ethnic groups. Factor analyses support the discriminant 
validity of the four content areas tested. Five tables and seven 
figures complement the discussion. A 32-item list of references is 
included. Eight appendices provide item analysis statistics, 
differential item functioning statistics, item parameters, test 
information functions, descriptions of individual items, 
intercorrelations of testlets, definitions of proficiency scores, and 
standard errors of measurement at theta scale points. (SLD) 
EXECUTIVE SUMMARY 



The National Education Longitudinal Study of 1988 (NELS:88) is sponsored by the 
National Center for Education Statistics (NCES) and is designed to monitor the 
transition of a national sample of young adults as they progress from junior to senior 
high school and then on to postsecondary education and/or the world of work. The 
primary purpose of the NELS:88 longitudinal study is to provide policy-relevant 
information on the effectiveness of schools, curriculum paths, special programs, 
variations in curriculum content, and/or mode of delivery in bringing about educational 
growth. 

Among the more important educational indicators that will be monitored at the 
eighth, tenth, and twelfth grades is the achievement test battery. The NELS:88 test 
battery is composed of four separate tests: Reading Comprehension, Mathematics, 
Science, and History/Citizenship/Geography. The NELS:88 test battery is critical to the 
measurement of growth in educational achievement that will take place during the last 
four years of secondary schooling. In addition to providing trend information on 
academic achievement for its longitudinal cohort, the test battery is also designed to 
provide cross-sectional trend information when comparisons are made with the 1980 
High School and Beyond cohorts. 

The NELS:88 base year (eighth grade) sample was composed of approximately 
24,600 eighth graders who were sampled from 1,052 schools. 

This report provides an in-depth description of the rationale, development and 
psychometric properties of the eighth grade test. 

The results suggest that the NELS:88 test battery either met or exceeded all of its 
psychometric objectives. The eighth grade analysis indicated that: 

• While the allotted testing time was only one and a half hours, quite acceptable 
reliabilities were obtained for the Reading Comprehension, Mathematics, 
History/Citizenship/Geography, and to a somewhat lesser extent the Science test. 

• The internal consistency reliabilities were sufficiently high to justify the use of 
Item Response Theory (IRT) scoring, and thus provide the framework for 
constructing tenth and twelfth grade forms that will be adaptive to the ability 
level of the student. The IRT scaling will enable the researcher to administer 
forms varying in difficulty at the tenth grade and to scale these scores on a 
common metric. The choice of test form administered to a student in grade ten 
will be determined by the relative ability level demonstrated by the student in 
grade eight. This adaptive approach will both minimize potential ceiling effects 
and increase measurement accuracy when the students are followed up in the 
tenth and twelfth grades. 
• There was no consistent evidence of differential item functioning (item bias) for 
either gender or racial/ethnic groups. 

• Factor analytic results supported the discriminant validity of the four tested 
content areas. Convergent validity was also indicated by salient loadings of 
testlets composed of "marker items" on their hypothesized factors. 

• In addition to providing the usual normative scores in all four tested areas, 
behaviorally anchored proficiency scores have been provided in both the Reading 
and Mathematics areas. 
CHAPTER 1. INTRODUCTION 



The National Education Longitudinal Study of 1988 (NELS:88) is designed to 
monitor the transition of a national sample of young adults as they progress from junior 
to senior high school and then on to postsecondary education and/or the world of work. 
The NELS:88 surveys are monitored by the Longitudinal and Household Studies Branch 
(LHSB) of the National Center for Education Statistics (NCES). NELS:88 is the third 
and most recent in a series of longitudinal studies that are designed to provide timely 
information on trends in academic achievement. The two earlier longitudinal studies 
sponsored by NCES were the National Longitudinal Study of the high school class of 
1972 (NLS) and the High School and Beyond (HS&B) study of 1980. 

The primary purpose of this longitudinal data collection effort is to provide policy- 
relevant information concerning the effectiveness of schools, curriculum paths, special 
programs, variations in curriculum content and/or mode of delivery in bringing about 
educational growth. Although similar in its purposes to its two predecessors (NLS-72 
and HS&B), NELS:88 is more comprehensive in the amount and type of data collected, 
as well as in the time period spanned by the data collection. 

The base year sample was composed of approximately 24,600 eighth grade students 
who were sampled from slightly more than 1000 schools in the spring of 1988. These 
students are being followed up in the tenth grade (first follow-up) in the spring of 1990. 
The second follow-up will take place in the spring of 1992, which would normally be 
their senior year in high school. Attempts will be made to locate and survey sample 
members who have left school by that time or are not high school seniors. Post- 
secondary follow-up surveys are also being planned. 

Among the more important educational indicators that will be monitored by the 
NELS:88 surveys is the achievement test battery. The NELS:88 test battery is critical 
for the measurement of academic growth that takes place between the eighth, tenth, and 
twelfth grades. In addition to measuring longitudinal growth during these critical years, 
the NELS:88 battery will also be used to compare the performance of the NELS:88 
sophomores in 1990 with the comparable 1980 sophomore cohort from the HS&B data 
collection, and 1992 NELS:88 seniors with the performance of HS&B and NLS-72 
seniors. 

For sample and race/ethnicity definitions and for detailed information about 
response rates, weighting, sample exclusions and survey methodology, please see the 
Base Year Student User's Manual (Ingels et al., 1990) and the Base Year Sample Design 
Report (Spencer et al., 1990). 

The purpose of this report is to provide an in-depth description of the rationale, 
development, and subsequent statistical analysis of the eighth grade NELS:88 test 
battery. 

CHAPTER 2. TEST SPECIFICATIONS 



Aims and Objectives 

The test specifications of the NELS:88 longitudinal test battery are dictated by its 
primary purpose: accurate measurement of the status of individuals at a given point in 
time as well as their growth over time. Like its predecessor, the 1980 High School and 
Beyond (HS&B) test battery, the National Education Longitudinal Study (NELS:88) 
test battery was developed to measure both individual status and growth in a number of 
achievement areas. The four achievement areas are Mathematics, Reading 
Comprehension, Science, and History/Citizenship/Geography. However, unlike the 
HS&B assessment, which was designed only to measure growth between the tenth and 
twelfth grades, the NELS:88 battery is designed to measure growth in achievement 
between the eighth, tenth and twelfth grades. Since the NELS:88 assessment spans four 
years with repeated testing of the same student cohort in the eighth, tenth and twelfth 
grades, it calls for a more flexible testing approach than was required in the HS&B 
longitudinal assessment. 

The construction of the NELS eighth grade battery is in some sense a delicate 
balancing act between several competing objectives. Many of these objectives were 
suggested by the NELS Technical Review Panel (TRP) and/or NCES project staff 
during the base year development. Some of these objectives were as follows: 

1) That the NELS:88 test battery cover four content areas - Reading, Mathematics, 
Science, and History/Citizenship/Geography. 

2) That there be sufficient common items in the tenth grade mathematics form to link 
with the tenth grade 1980 HS&B cohort. Since the NELS:88 eighth grade 
mathematics test must also be linked to the tenth grade followup test, it would seem 
reasonable to have the linking items from HS&B be common to both the eighth and 
tenth grade NELS:88 mathematics tests. 

3) That there be sufficient item overlap between the National Assessment of 
Educational Progress (NAEP) mathematics test and the eighth grade NELS:88 
mathematics test to cross-walk to the NAEP mathematics scale if desired. Similar 
overlap was suggested for the NELS:88 reading test. 

4) That the reading test passages provide relatively broad content coverage and have 
items that span at least three cognitive process areas. There also should be at least 
one passage that identifies in some way with minority concerns. Similarly, there 
should be at least one passage in which the main character is a female. 

5) The Technical Review Panel suggested that the mathematics test, where possible, 
should emphasize concept understanding and problem solving skills in the areas of 
arithmetic, algebra, and geometry. It was felt that in a building block discipline such 
as mathematics, knowledge of the concepts that form the foundations that are later 
built upon are less likely to be learned and then forgotten. 
6) The four content areas Reading, Mathematics, Science, and History/Citizenship/ 
Geography must be administered (including time for administration instructions) 
within one hour and a half. 

7) The tests should be sufficiently reliable to support change measurement, and in the 
case of mathematics and reading be characterized by a sufficiently dominant 
underlying factor to support the Item Response Theory (IRT) model. This latter 
requirement is necessary to support the vertical equating between retestings as well 
as the cross-sectional linking with HS&B and NAEP, if desired. Given the time 
constraints, this is a "tall order". In order to achieve this level of reliability, as well 
as reduce the possibility of "floor and ceiling" effects, the Mathematics and Reading 
tests will be designed to be multi-level at the tenth grade. 



Two-Stage Testing in a Longitudinal Framework 

The potentially large variation in student growth trajectories over a four year 
period argues for a longitudinal "tailored testing" approach to assessment. That is, in 
order to accurately assess a student's status both at a given point in time as well as over 
time, the individual tests must be capable of measuring across a broad range of 
ability/achievement. If the same test in, say, Mathematics and Reading Comprehension 
were administered to the same student at the eighth, tenth, and twelfth grades, the 
potential for observing "floor effects" at grade eight and "ceiling effects" at grade twelve 
is greatly increased. Of course if all four tests were quite long and included many very 
difficult as well as many very easy items, then theoretically there would be little 
opportunity for floor and ceiling effects to operate. 

Unfortunately operational versions of the test must be relatively short in order to 
minimize the testing time burden on the students and their school systems. One 
potential solution to this problem is to use a two-stage testing procedure that allows one 
to at least partially tailor a test form to a particular individual's ability/achievement 
level. 

That is, a two-stage longitudinal testing procedure will be implemented that would 
use the eighth grade test results for each student to assign him or her to a different form 
of the test when he or she is retested in tenth grade. For example, students scoring 
relatively high on the eighth grade test in, say, mathematics would be given a more 
difficult mathematics test form when they are retested as tenth graders. Students scoring 
relatively low in the eighth grade would receive an easier form when retested as tenth 
graders. Since tenth grade students would be taking forms that were in a sense 
appropriate to their particular level of ability/achievement, measurement accuracy 
would be enhanced and floor and ceiling effects would be minimized. The relative 
absence of ceiling effects should make the assessment of gain more accurate for students 
who had relatively high scores as eighth graders. Similarly, the accuracy of gain estimates 
for low scoring eighth graders should also be enhanced, since floor effects should be 
minimized. 
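
The routing step described above can be illustrated with a minimal Python sketch. The cut scores, the number of forms, and the form labels below are hypothetical; this report does not specify them, and the operational assignment rule may differ.

```python
from bisect import bisect_right

# Hypothetical cut scores on the eighth grade test (not given in this report).
CUT_SCORES = [15, 28]
# Hypothetical labels for tenth grade forms of increasing difficulty.
FORMS = ["easy", "middle", "hard"]


def route_to_form(grade8_score: int) -> str:
    """Return the tenth grade form assigned for an eighth grade score."""
    # bisect_right counts how many cut scores are at or below the score,
    # which is the index of the form the student is routed to.
    return FORMS[bisect_right(CUT_SCORES, grade8_score)]


for score in (10, 15, 27, 28, 40):
    print(score, route_to_form(score))
```

A student at or above a cut score moves to the next harder form in this sketch; whether boundary scores route up or down is a design choice that the sketch makes arbitrarily.
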
What does the utilization of a two-stage procedure have to say about how the 
components of the NELS:88 eighth grade battery should be constructed? Since at least 
some of the eighth grade tests (reading and mathematics) are to serve as "branching" or 
"routing" tests, ideally they should have good measurement properties throughout the 
test score range. That is, the test scores should provide reliable information at both the 
high and the low end of the test score distribution since students in these score ranges 
will be routed to tests of quite different average difficulties in the tenth grade. 

Difficulty Level 

The eighth grade reading, mathematics, and to a lesser extent the science and 
history/citizenship/geography tests were designed with these broad band measurement 
properties in mind. Operationally the goal of maintaining good measurement accuracy 
throughout the test score range is accomplished by building tests with a relatively 
rectangular frequency distribution of item difficulties. The typical test tends to follow a 
normal distribution of difficulties with the majority of the items in the middle difficulty 
range. A normal distribution of difficulties is considered to be relatively optimal if: 

1) The population being tested is relatively homogeneous with respect to the 
ability/achievement being measured. 

2) Diagnostic decisions (e.g., routing to different second stage tests) need not be made 
for individuals at either the high or low end of the test score (ability) distributions. 

3) Reliable measurement of status at a given point in time is of primary importance and 
not the measurement of change. Ideally, change score analysis should be able to 
model a developmental growth model that has students at different points along the 
growth trajectory. If a test is built to simulate the various points along the growth 
trajectory, i.e., some items are selected for inclusion based on how well they 
represent steps in the developmental growth model, then there needs to be a greater 
diversity of item difficulties. Items should not all be "packed" at the middle difficulty 
level since that at best could only reflect accurate measurement of one step in the 
underlying developmental model. 

4) Students are grouped into homogeneous ability/achievement groups based on, say, a 
previously administered routing test. Students then could be administered separate 
test forms with each form having the majority of its items at the appropriate difficulty 
level for the corresponding ability grouping. 

At the eighth grade level the total population is relatively heterogeneous. In 
addition, as pointed out above, the present plans call for the tenth grade students to be 
routed to different test forms depending on how well they did on their eighth grade 
testing. Separate mathematics and reading forms varying in average difficulty will be 
administered to homogeneous groupings of students based on their eighth grade 
achievement scores. These "tailored" test forms will be more homogeneous with respect 
to item difficulties within a test form since they are designed to match the ability level 
of the test taker. However, since one of the purposes of the NELS:88 eighth grade 
battery is to provide diagnostic or routing information for the succeeding administration 
in the tenth grade, we have emphasized a broader range of item difficulties in the eighth 
grade tests. 
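
The trade-off between a peaked and a relatively rectangular distribution of item difficulties can be sketched numerically. The example below assumes a one-parameter (Rasch) model purely for simplicity; the item difficulty values, the 21-item length, and the ability points are illustrative and are not taken from the NELS:88 calibration.

```python
import math


def rasch_p(theta, b):
    """Probability of a correct response under the Rasch model (assumed here)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def information(theta, difficulties):
    """Test information at theta: sum of p * (1 - p) over the items."""
    return sum(rasch_p(theta, b) * (1.0 - rasch_p(theta, b)) for b in difficulties)


def evenly_spaced(low, high, n):
    """Return n equally spaced values from low to high inclusive."""
    step = (high - low) / (n - 1)
    return [low + i * step for i in range(n)]


N_ITEMS = 21  # illustrative test length
peaked = evenly_spaced(-0.5, 0.5, N_ITEMS)       # items packed at middle difficulty
rectangular = evenly_spaced(-2.5, 2.5, N_ITEMS)  # items spread across the range

for theta in (-2.5, 0.0, 2.5):
    print(theta,
          round(information(theta, peaked), 2),
          round(information(theta, rectangular), 2))
```

Under these assumptions the peaked test carries more information near the middle of the ability scale, while the rectangular test carries more at the extremes, which is where routing decisions are made.
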

IRT Scaling for Longitudinal Measurement and Equating to Earlier Cohorts 

In order to accurately measure the extent of eighth to tenth grade gains at both the 
group and individual level, the eighth grade tests and the various forms of the tenth 
grade tests must be calibrated on the same scale. The most convenient way of doing 
this is to use Item Response Theory (IRT). In order to successfully carry out such a 
calibration for, say, mathematics and reading, both the eighth and tenth grade tests 
should be relatively unifactorial with the same factor underlying both test 
administrations. This suggests that there be a common set of anchor items across eighth 
and tenth grade forms, and that most, but not necessarily all, content areas be 
represented in both eighth and tenth grade forms. Increments in difficulty demanded by 
future tenth and twelfth grade forms can be accomplished by: (1) increasing the 
problem-solving demands within the same familiar content areas and (2) including 
content in the later forms that tap materials normally found in the advanced course 
sequence. 
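
As an illustration of how a common set of anchor items can place two forms on one scale, the sketch below applies mean/sigma linking to anchor item difficulties. This is one standard linking method chosen for illustration; this section does not state which linking procedure was used for NELS:88, and all parameter values below are invented.

```python
from statistics import mean, pstdev


def mean_sigma_link(b_anchor_base, b_anchor_new):
    """Return slope A and intercept B mapping the new scale onto the base scale.

    Both lists hold difficulty estimates for the same anchor items,
    one list per calibration.
    """
    slope = pstdev(b_anchor_base) / pstdev(b_anchor_new)
    intercept = mean(b_anchor_base) - slope * mean(b_anchor_new)
    return slope, intercept


def to_base_scale(value, slope, intercept):
    """Transform a difficulty or ability value from the new scale to the base scale."""
    return slope * value + intercept


# Invented anchor difficulties from two separate calibrations of the same items.
base_scale = [-1.0, 0.0, 1.0, 2.0]
new_scale = [-1.2, -0.4, 0.4, 1.2]

A, B = mean_sigma_link(base_scale, new_scale)
print(round(A, 3), round(B, 3))
print([round(to_base_scale(b, A, B), 3) for b in new_scale])
```

The more anchor items the forms share, and the more evenly those items span the difficulty range, the more stable the estimated slope and intercept will be.
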

The NELS:88 test battery scores must not only be put on the same vertical scales 
(i.e. from eighth to tenth to twelfth grade) but the mathematics items administered in 
the tenth grade must also provide "anchors" to the tenth grade HS&B mathematics items 
administered in 1980. While not required by contract, it would be desirable to be able 
to cross-walk the 1980 HS&B sophomore reading scores to the 1990 NELS:88 
sophomore reading scores. The ability to put both the HS&B and NELS:88 sophomores 
on the same scale allows for a 10 year span cross-sectional trend comparison as well as 
the potential for a 10 year comparison between the HS&B sophomore to senior gains in 
1980-1982 vs. those made by the NELS:88 students between 1990 and 1992. 
Appropriate use of IRT-scaling for these purposes requires that, to the extent possible, 
the tests be single-factor. 

This cross-sectional scaling in addition to the vertical scaling (eighth through 
twelfth) puts additional constraints on mathematics and reading item selection for both 
the eighth grade and the subsequent follow-up tests. That is, in the case of mathematics 
at least 10 to 12 of the items should be common to both the eighth and tenth grade 
NELS:88 battery as well as to the tenth grade HS&B battery. 



Psychometric Goals of the NELS:88 Eighth Grade Test Battery 

While the long-term purpose of the NELS:88 battery is to accurately measure the 
status and growth of students at the individual level in four broad achievement areas, 
there are a number of allied psychometric and policy concerns that need to be addressed 
in the eighth grade battery. These concerns are as follows: 
• Item selection should be curriculum-relevant, with emphasis on concepts, skills 
and general principles. When measuring change or developmental growth, the 
overemphasis on isolated facts at the expense of conceptual and/or 
problem-solving skills may lead to distortions in the gain scores due to forgetting. More 
will be said about this later. 

• The tests should be relatively unspeeded with the vast majority of students 
completing all tests. 

• There should be little evidence of floor or ceiling effects if the same test is to 
be repeated in the tenth grade. 

• Reliabilities of the component tests should be psychometrically acceptable for 
the purpose of measuring individual status as well as growth. Unlike NAEP, 
which only assesses the status of groups, the NELS:88 battery must assess 
individuals and thus the tests require proportionately greater reliability than do 
their NAEP counterparts. 

• The accuracy of measurement, i.e., the standard error of measurement, should 
be relatively constant across SES, sex and racial/ethnic groups. In fact, the 
NELS:88 battery is specifically designed to reduce the gap in reliabilities that is 
typically found between the majority group and the racial/ethnic minority 
groups. 

• The test components should demonstrate some discriminant validity. That is, 
while the tests should be internally consistent and essentially be unifactorial (in 
the case of Reading and Mathematics), they should yield a relatively "clean" 
although oblique four factor solution. The four factors should be defined by the 
four tested content areas. 

• Subscores and/or proficiency scores should be provided where psychometrically 
justified. The test specifications were designed to provide behaviorally-anchored 
proficiency scores in the areas of Mathematics and Reading. 

• The NELS:88 test battery should attempt to minimize Differential Item 
Functioning (DIF) across gender and racial/ethnic groups that arises from 
irrelevant content that favors one or more of the groups. This, of course, refers 
to the so-called item bias problem. 

• The NELS:88 test battery should share sufficient common items both across 
grade levels and with the HS&B battery to provide articulation of scores for 
vertical equating in NELS:88 as well as cross-sectional equating with HS&B. 

Many of the following analysis results address the above concerns. 
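
The link between reliability and the standard error of measurement mentioned in the list above can be made concrete with the classical test theory relation SEM = SD * sqrt(1 - reliability). The sketch below uses that standard formula; the group labels, standard deviations, and reliabilities are hypothetical and are not results from this report.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical test theory SEM: score SD times sqrt(1 - reliability)."""
    return score_sd * math.sqrt(1.0 - reliability)


# Hypothetical (score SD, reliability) pairs for two subgroups.
GROUPS = {
    "group A": (10.0, 0.84),
    "group B": (9.0, 0.75),
}

for name, (sd, rel) in GROUPS.items():
    print(name, round(standard_error_of_measurement(sd, rel), 2))
```

In this invented example a lower reliability in one group yields a larger SEM even though its score spread is smaller, which is the kind of gap the battery was designed to reduce.
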
Specifications for Individual Tests 



Given that the maximum allowable testing time for eighth graders was 
approximately one hour and thirty minutes, it was decided that the time would be 
apportioned in the following way among the test battery components: 

Reading - Twenty-one questions in twenty-one minutes. 

Mathematics - Forty questions in thirty minutes. 

Science - Twenty-five questions in twenty minutes. 

History/Citizenship/Geography - Thirty questions in fourteen minutes. 
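
The allocation above implies quite different amounts of time per item across the four tests. The short Python sketch below works out that arithmetic from the item counts and minutes just listed; the variable and function names are illustrative only.

```python
# (number of items, minutes allowed) for each test, as listed above.
ALLOCATION = {
    "Reading": (21, 21),
    "Mathematics": (40, 30),
    "Science": (25, 20),
    "History/Citizenship/Geography": (30, 14),
}


def seconds_per_item(items, minutes):
    """Average time available per item, in seconds."""
    return minutes * 60 / items


total_items = sum(items for items, _ in ALLOCATION.values())
total_minutes = sum(minutes for _, minutes in ALLOCATION.values())

for test, (items, minutes) in ALLOCATION.items():
    print(f"{test}: {seconds_per_item(items, minutes):.0f} seconds per item")
print(f"Total: {total_items} items in {total_minutes} minutes of timed testing")
```

The timed sections sum to somewhat less than the hour and a half available, which is consistent with the requirement that the limit include time for administration instructions.
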

Based on simulations utilizing field test results (Rock & Pollack, 1987), ETS test 
development experts felt that these separately timed content areas would provide 
accurate assessment of each content area while minimizing any speededness component. 
The items that were used in the final eighth grade forms were selected from a much 
larger pool of items composed of items from NAEP, HS&B, the Second International 
Mathematics Study (SIMS), ETS test files from previous operational tests, and a pool of 
items specifically written for the NELS:88 Battery. The selection of items for the pre- 
test item pools was based on the consensus of the members of subject matter 
committees made up of curriculum experts. The subject matter committees consisted of 
educators, teachers, and college professors specializing in middle school curricula. 
There was considerable personnel overlap with similar subject matter committees used 
in the NAEP item pool development. ETS test development specialists were in 
attendance and worked with their respective subject matter committees in developing 
the eighth grade assessment objectives. Once the assessment objectives were agreed 
upon, the subject matter committee members classified the items according to the 
objectives. A pool of 50 Reading items, 82 Mathematics items, 42 Science items, and 60 
History/Citizenship/Geography items was selected for pretesting. Field tests were 
administered to eighth, tenth and twelfth graders in the Spring of 1987 (Rock & Pollack, 
1987). The results of the field testing were scrutinized by additional committees of 
subject matter experts who suggested numerous modifications in content, format and 
wording of the items, as well as making judgments on content coverage. Final revisions 
and item selections were made by project staff on the basis of their input, and reviewed 
by NCES staff. 

The following sections contain descriptions of the content and format of each of 
the four achievement tests. More detailed item-by-item specifications of the curriculum 
content, cognitive process, format, source, and particular content of the test items can be 
found in Appendix E. 



Reading 

The reading test consisted of five reading passages, ranging in length from a single 
paragraph to a half-page. Each passage was followed by three to five multiple choice 
questions addressing the students' ability to reproduce details of the text, translate verbal 
statements into concepts (comprehension), or draw conclusions based on the material 
presented (inference/evaluation). A total of 21 questions were presented in 21 minutes. 
The amount of time allowed for each question, which is relatively long compared to the 
other three content areas, takes into account the length of time needed for reading the 
passages before answering the questions. 

The reading test began with the least difficult (literary) passage followed by five 
relatively easy questions. The percent answering each item correctly (P+, a measure of 
item difficulty) by total and subgroups is presented in Appendix A-1. The next passage 
was a short science passage followed by three questions. These three questions were 
more difficult than those associated with the literary passage. The increased difficulty 
could be due to the science content or the fact that the questions went beyond simple 
reproduction of detail. The next passage was a six item poetry passage. The item 
difficulties varied from relatively easy to relatively difficult. The fourth passage was a 
biographical piece concerning the Black jazz musician Louis Armstrong and was 
followed by four questions of medium difficulty. The last three items were based on a 
passage discussing the role of pioneer women. These items were relatively easy. The 
first eight items in the reading test used a five option multiple choice format while the 
remaining fifteen items used a four option multiple choice format. Other than to 
present a relatively easy passage first, no conscious attempt was made to present the 
remaining items in order of difficulty. The motivation for including several very easy 
items on this test came from the field test results. Pretesting of the reading materials 
indicated the possibility for floor effects for some individuals. 

Figure 1 presents a two-way table of reading passage content categories by 
cognitive process categories for the reading test. The entries in the cells of the matrix 
are the number of items in that particular cross-classification. Appendix E-1 contains 
additional details on the content and characteristics of individual items. 

Inspection of Figure 1 indicates that the eighth grade test attempted to cover as 
many content areas as possible given the limitations inherent in the time allocation. In 
order to achieve a reasonable level of discrimination for the low, middle and higher 
level readers, there were items requiring simple reproduction of detail as well as items 
requiring comprehension and inference skills. One passage (the biographical passage) 
discussed the life of a Black musician. The primary characters in one of the other 
passages were women pioneers. The remaining passages did not contain references to 
the race/ethnicity of the characters, and the gender of the characters was not an 
important issue. This attempt to balance the content of the reading passage with respect 
to gender and race/ethnicity represents an effort to reduce the potential for bias 
affecting subgroups of the population. 

Figure 1. Reading test specifications (number of items by process and content)

                                          CONTENT
PROCESS                        Literary   Science   Poetry   Biography
Reproduction of detail             3         1        -          -
Comprehension                      -         1        1          1
Inference and/or Evaluation        5         1        5          3

As expected, the comprehension and inference/evaluation items tended to be 
somewhat more difficult than those items requiring simple reproduction of detail. While 
the comprehension and inference/evaluation items were more difficult on average than 
the reproduction of detail items, they were purposely designed not to be extremely 
difficult for the typical eighth grader for two reasons: 

1) We were not concerned about ceiling effects at grade 8 imposing artificial 
constraints on eighth to tenth grade gains since we were planning to route students 
to forms that would be appropriate for their ability level at the tenth grade. 

2) We were attempting to increase the accuracy of measurement for the low SES 
and/or racial/ethnic groups who traditionally score lower on cognitive measures. 
The trick is to accomplish this goal without sacrificing the overall reliability, i.e., 
the reliability estimated for the total population. Widening the range of item 
difficulties to include several very easy items was intended to aid in reaching this 
objective. 



Mathematics 

The proportions correct (P+) for the mathematics test items are presented in 
Appendix A-2. The first 19 items in the mathematics test are referred to as quantitative 
comparison items. While these items follow the multiple choice mode they have a 
somewhat different format than the typical multiple choice item. The student is 
presented with two quantities, one in column A and one in column B. He or she is then 
asked to compare the two quantities and mark option (A) if the quantity in column A is 
greater; (B) if the quantity in column B is greater; (C) if the two quantities are equal; 
and (D) if the relationship cannot be determined from the information given. 

These first 19 quantitative comparison items cut across most of the content areas 
but tended to be classified as skills and/or declarative knowledge or understanding/ 
comprehension of concept. The quantitative comparison item type was included in the 
mathematics test for two reasons. First and primarily, this was the only item type used 
in the HS&B mathematics test and thus they can provide us with the common item 
anchors needed for the cross-sectional equating. Secondly, they tend to take less time to 
administer than other formats and thus the student can do approximately three 
quantitative comparison items for every two standard multiple choice items. Assuming 
equal item reliabilities, we can achieve significantly higher test reliability for a fixed 
amount of testing time. Inspection of the item biserials (a measure of an item's 
reliability) in Appendix A-2 does suggest that the item reliabilities of the quantitative 
comparison and the standard multiple choice are about the same. 

One additional concern about the quantitative comparison item type is that the 
format might be sufficiently unfamiliar to some of the students to make the items artificially 
difficult. Inspection of the item difficulties in Appendix A-2 suggests that these items 
run the gamut from easy to hard. The finding that they are not differentially difficult for 
minority groups will be treated in the section dealing with differential item performance. 

The remaining mathematics items are the standard 4 option and 5 option multiple 
choice item types, containing a mix of word problems, diagrams, and calculations. 
There is a slight ordering with respect to difficulty since the more difficult problem 
solving items were placed near the end of the test. 

Figure 2 presents the test specifications in terms of item classifications for the 
eighth grade mathematics test. See Appendix E-2 for content information on an 
item-by-item basis. 

Inspection of Figure 2 indicates that nearly half of the items in the eighth grade 
mathematics test can be classified as requiring skills or declarative knowledge. The 
"skills and declarative knowledge" category actually includes two relatively separable 
knowledge demand levels. The lowest level consists primarily of simple arithmetical 
operations on whole numbers and the second level requires skills in operations with 
decimals, fractions, and percentages. The "understanding/comprehension" level consists 
of items that require translating verbal statements and concepts into figures, and 
demonstrating understanding of concepts and principles through explanation, recognition 
or illustration. For example, arrival at the correct answer may involve understanding 
the relationship between decimals and percentages, etc. The higher order problem 
solving category is less well defined at this level (eighth grade) but it typically involves 
generalizing and applying mathematical knowledge, skill and comprehension in situations 
requiring reasoning, judgment, and decision-making processes. It is anticipated that the 
tenth grade mathematics forms will include a larger representation of items requiring 
problem solving skills. 



Figure 2.--Mathematics specifications (number of items by process and content) 

                                                CONTENT 
                                                                 Data/       Advanced 
PROCESS                        Arithmetic   Algebra   Geometry   Probability  Topics 

Skills/Knowledge                   10          4         1           1          1 
Understanding/Comprehension         6          7         3           3          - 
Problem Solving                     3          -         -           -          1 

It should be pointed out here that when one computes content subscores based on 
say, the arithmetic and algebra items, one should not be surprised if such subscores are 
very highly correlated since both content areas include similar item distributions with 
respect to cognitive demands (i.e., processing demands). Most students, by the eighth 
grade, have been exposed to instruction in the skills needed to solve the lowest level 
(Skills/Knowledge) items. Therefore, individual differences in performance are going to 
be driven by differential exposure and practice in the higher-level skills related to 
concept understanding and simple problem solving. 

Subscores or proficiency scores based on the rows (cognitive processes) of the 
above classification matrix may have a greater potential for yielding discriminable 
subscores than those based on the columns (content areas). The rows that define the 
cognitive processes tend to follow a difficulty hierarchy. That is, the skills at each 
higher level require all the skills of the lower levels plus some new additional skill. 
This hierarchy in complexity tends to 
make subscores based on items describing these different cognitive process levels 
somewhat more differentiable than those based on the content areas. The increase in 
conceptual complexity as one goes from the simple rule-following of the declarative 
knowledge items to the item types representing conceptual understanding and finally 
problem solving suggests that possibly qualitatively different skills come into play as one 
proceeds up the "ladder" of complexity. 



Science 

The item format for the science test is the standard multiple choice format with 
approximately two-thirds being four choice and the remaining items five choice. The 
majority of the items contain a verbal description of a situation followed by a question 
based on the premise. Several items include graphs or diagrams illustrating the 
circumstances described. There is a considerably stronger relationship between item 
sequence and item difficulty in the science test when compared to the reading and 
mathematics tests. That is, inspection of Appendix A-3 indicates that there is a relatively 
consistent increase in item difficulty as one proceeds from the beginning to the end of 
the test. Indeed the science items were ordered to reflect their pretest difficulties. 

Figure 3 presents a two-way table of the classification of the science items. 
Additional detail on characteristics and content of individual items can be found in 
Appendix E-3. 

Since no computations are involved in the science items (unlike the higher level 
mathematics items) and inferences from facts may be more straightforward than in the 
reading comprehension test, often understanding the concept is tantamount to solving 
the item. As a result these process classifications in science are particularly sensitive to 
differences in opinion among science experts. Content areas in science also have a 
tendency to overlap with each other. While this is true for the other areas also, it is 
especially true for science items. 



History/Citizenship/Geography 

The History/Citizenship/Geography test items were only classified according to 
content area. Of the 30 items in the test, fourteen were history questions; thirteen were 
citizenship/government questions, and the remaining three items dealt with geography/ 
economic development. 



Figure 3.--Science test specifications (number of items by process and content) 

                                        CONTENT 
                                                        Scientific 
PROCESS                    Earth    Life    Chemistry    Method 

Declarative Knowledge        5        3         -           - 
Comprehension                2        2         2           1 
Problem Solving              1        3         3           1 

The three content areas were distributed throughout the test. The items were 
sequenced for the most part on the basis of their pre-test difficulties with the easier 
items in the beginning and the most difficult items near the end. Appendix A-4 presents 
the item difficulties. Content, source, and descriptive information on each item can be 
found in Appendix E-4. The item format consisted of twenty-two four-option multiple 
choice items, three five-option multiple choice items, and five true-false items. 



Matching Test Content to Curriculum 



The question of overlap between test items and curriculum content has received 
increasing attention over the last ten years and evaluation methodologies have come to 
be dominated by the doctrine of maximal overlap (Frechtling, 1989). Mehrens (1984) 
and Cronbach (1963), however, questioned whether maximal overlap is in fact desirable 
except possibly in those cases where a specific program is being evaluated. Mehrens 
argues that a close match between curricular and test content is desirable only if one 
wishes to make inferences about specific objectives taught by a specific teacher in a 
specific school. Even if one would wish to evaluate the effects of a specific teacher in a 
specific class, one inference of importance is the degree to which the specific knowledge 
taught in that class generalizes to other relevant domains. 

Nitko (1989) argues that tests designed to measure individuals and to facilitate 
their learning within a particular instructional context are not necessarily optimum for 
measuring school or program differences. Similarly Airasian & Madaus (1983) suggest 
that the following design variables be taken into account: 

(A) The ability of tests to detect differences between groups of students. 

(B) The relative representativeness of the content-behavior-process sampled by 
test items. 

(C) The parallelism of the response formats and mental processes learned during 
instruction with those defined by the test tasks. 

(D) The properties of the scores and the way that they will be summarized and 
reported. 

(E) The validity of the inferences about school and program effectiveness that 
can be made from the test results. 

Experience and practice suggest that tests are unlikely to detect differences 
between schools and programs when total test scores are used and when the subject 
matter tested is likely to be related to learning in the home (e.g. reading) rather than to 
schooling (e.g. mathematics) (Airasian & Madaus, 1983; Linn & Harnisch, 1981). 

Schmidt (1983) identifies three major types of domains from which content to be 
covered can be drawn: a priori domains, curriculum-specific or learning-material-specific 
domains, and instructional material domains. Nitko (1983) suggests that "agents" not 
associated with local schools or particular programs tend to define a priori domains by 
using social criteria in judging what is important for all to learn. He goes on to suggest 
that test exercises in the National Assessment of Educational Progress (NAEP) as well 
as state assessment programs are examples of assessment instruments built from a priori 
domains since they specify content to be included without linking that content to specific 
instructional material or specific instructional events. 

Cole & Nitko (1981) suggest that another design variable be considered in building 
tests to detect school and program effectiveness. They suggest that students require 
more time to acquire global skills and to grow in general educational development than 
to learn specific knowledges and skills. They suggest that tests measuring the former are 
less sensitive to measuring short term instructional efforts than tests measuring the 
latter. 



Cooley (1977) and Leinhardt (1980) argue for the collection of relevant classroom 
variables and developing tests that are sensitive to differences between classrooms 
within-program. Leinhardt & Seewald (1981) describe several within-school, program, 
and classroom variables that are important to program evaluators and how to measure 
them. Mehrens and Phillips (Mehrens, 1984; Mehrens & Phillips, 1986; Phillips & 
Mehrens, 1988), however, found no significant differences on standardized tests from the 
use of different textbooks and different degrees of curriculum-test overlap when previous 
achievement and socioeconomic status were taken into account. 

What we have attempted to do here is take something of a middle road in the sense that 
our curriculum experts were instructed to select items that were curriculum relevant but 
typically did not require a great deal of isolated factual knowledge. The emphasis was 
to be on understanding concepts and the measurement of problem-solving skills. 
However, it was thought necessary to assess the basic operational skills (e.g., simple 
arithmetic and algebraic operations) which are the foundations for successfully carrying 
out the problem solving tasks. 

The incorporation in the mathematics test of the relatively simple arithmetic and 
algebraic items which measure procedural or factual knowledges served two purposes. 
First, this subset of items provided better assessment for those low scoring students who 
were just beginning to develop their "basic mathematical skills". Second, these items 
should be able to provide a limited amount of diagnostic information about why some 
students are not able to successfully carry out the tasks defined in the typically more 
demanding problem solving items. For example, students who are not proficient on the 
problem solving items can be further divided into two groups based on their 
performance on the arithmetical/algebraic procedural skill items. One subgroup could 
not very well be proficient on the problem solving items since they did not demonstrate 
sufficient skills on the simple arithmetical/algebraic procedures that are a necessary but 
not a sufficient condition for successful performance on the problem solving tasks. The 
remaining subgroup, however, had sufficient grounding in the basics as demonstrated by 
their successful performance on the procedural items but were unable to carry out the 
logical operations necessary to complete the solutions to the problem solving items. 
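
The two-way split described above can be illustrated with a short sketch. It is only an illustration: the function name, the label strings, and the 75 percent mastery cutoff are hypothetical and are not taken from the NELS:88 scoring rules.

```python
def diagnostic_group(procedural_correct, procedural_total,
                     problem_correct, problem_total,
                     mastery_cutoff=0.75):
    """Sort a student into one of three coarse diagnostic groups.

    mastery_cutoff is a hypothetical proportion-correct threshold,
    not a value specified in the report.
    """
    procedural_ok = procedural_correct / procedural_total >= mastery_cutoff
    problem_ok = problem_correct / problem_total >= mastery_cutoff

    if problem_ok:
        return "proficient on problem solving"
    if procedural_ok:
        # Basics are in place, so the difficulty lies in the reasoning step.
        return "procedures mastered, problem solving not yet mastered"
    # Procedural skill is necessary for problem solving and is missing here.
    return "basic procedures not yet mastered"


if __name__ == "__main__":
    print(diagnostic_group(9, 10, 1, 4))
    print(diagnostic_group(3, 10, 0, 4))
```

The ordering of the checks mirrors the argument in the text: procedural skill is treated as necessary but not sufficient, so it is only consulted for students who fall short on the problem solving items.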

This hierarchical nature of the required skills is put to formal use in the 
development of behaviorally anchored proficiency level scales for both reading and 
mathematics. This criterion referenced interpretation is discussed further on under the 
subtopic Proficiency Level Subscores. 

This concern with respect to the maximal overlap doctrine is particularly relevant 
to the measurement of change over relatively long periods of exposure to varied 
educational treatments. That is, the two year gaps between re-testings coupled with a 
very heterogeneous student population are quite likely to coincide with considerable 
variability in course taking experiences. This fact, along with the constraints on testing 
time, makes coverage of specific curriculum related knowledges very difficult. Also, as 
indicated above, specificity in the knowledges being tapped by the cognitive tests could 
lead to distortions in the gain scores due to forgetting of specific details. It is our 
opinion that the impact on gain scores due to forgetting will be minimized if the 
cognitive battery increasingly emphasizes general concepts and development of problem 
solving abilities. This emphasis should increase as one goes to the tenth and twelfth 
grades. Students who take more high level courses, regardless of the specific course 
content, are likely to increase their conceptual understanding as well as gain additional 
practice in problem solving skills. 

At best, any nationally based longitudinal achievement testing program must be a 
compromise that attempts to balance testing time burdens, the natural tensions 
between local curriculum emphasis and more general mastery objectives, and the 
psychometric constraints (in the NELS:88 case) in carrying out both vertical equating 
and cross-sectional equating. NELS:88 fortunately does have the luxury of being able to 
gather longitudinal pre-test data on the item pools. Thus we have been able to take 
into consideration not only the curriculum relevance but whether or not the items 
demonstrate reasonable growth curves, as well as meet the usual item analysis parameter 
requirements for item quality. 

CHAPTER 3. PSYCHOMETRIC ANALYSIS RESULTS 



Were the Tests Speeded? 

ETS uses a two-part "rule-of-thumb" for determining whether or not a test is 
speeded. A test is considered to be unspeeded if nearly all test-takers reached the 
three-quarters point of the test, and at least 80 percent of the students answered the last 
item. The first criterion was met by 97 percent or more of students in all subgroups for 
all four NELS:88 tests, with the exception of Black students, 95 percent of whom 
reached the three-quarters point on the reading test. Table 1 below presents the 
statistics for the second criterion, percent answering the last item. Inspection of the 
entries in Table 1 indicates that all tests exceeded this criterion by a considerable margin 
for all groups. In a test such as NELS:88, which represents a "no risk" situation for the 
student, failure to answer items may be due to a lack of motivation as well as to 
insufficient time. It is evident that the allocated test timings were appropriate for all 
eighth grade groups. 
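
As an illustration only, the two-part rule of thumb might be computed from raw response data along the following lines. The function names are hypothetical, unanswered items are assumed to be coded as None, and reading "nearly all" as 95 percent is our assumption, since the rule as stated gives no number for the first criterion.

```python
import math


def speededness_indices(responses):
    """Return (percent reaching three-quarters point, percent answering last item).

    responses: one list per student; None marks an unanswered item.
    A student is counted as having reached a position if any item at or
    after that position was answered.
    """
    n_students = len(responses)
    n_items = len(responses[0])
    three_quarter_pos = math.ceil(0.75 * n_items) - 1  # 0-based index

    reached = sum(
        any(answer is not None for answer in student[three_quarter_pos:])
        for student in responses
    )
    answered_last = sum(student[-1] is not None for student in responses)
    return 100.0 * reached / n_students, 100.0 * answered_last / n_students


def is_unspeeded(pct_three_quarters, pct_last_item,
                 nearly_all=95.0, last_item_min=80.0):
    """Apply the two-part rule; nearly_all=95.0 is an assumed threshold."""
    return pct_three_quarters >= nearly_all and pct_last_item >= last_item_min


if __name__ == "__main__":
    # Values reported in the text for Black students on the reading test.
    print(is_unspeeded(95.0, 87.9))
```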



Table 1.--Speededness indices for tests, by racial/ethnic and sex groups 
(percent of sample who reached last item) 

TEST            Asian   Hispanic   Black   White   Male   Female 

Reading          96.1     92.7      87.9    97.3   94.9    95.9 
Math             96.1     93.2      89.7    96.2   95.0    94.9 
Science          96.2     95.3      92.6    98.0   96.7    97.0 
Hist./Citiz.     96.6     95.5      94.6    97.9   97.0    97.3 



SOURCE: U.S. Department of Education, National Center for Education 
Statistics, NELS:88 Base Year Survey. 



Reliabilities of the NELS:88 Eighth Grade Test Battery 

Table 2 presents the reliabilities and standard errors of measurement for 
racial/ethnic and sex groups for each test in the NELS:88 eighth grade battery. These 
reliabilities are based on weighted data. For comparison purposes the reliabilities and 
standard errors of measurement are also shown for the analogous components of the 
HS&B sophomore test battery (Rock et al., 1985). The reliabilities are internal 
consistency measures based on coefficient Alpha. High coefficient Alpha reliabilities 
(eighties and above for tests of this length) suggest that the tests are relatively 
unifactorial. While standard errors of measurement (SEM's) are presented for both the 
NELS:88 and the HS&B battery, they (the SEM's) are not strictly comparable, since 
both the instruments and the populations are different. In such cases, reliabilities are 
the preferred measure of accuracy. 
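
For readers who want to see the two statistics side by side, a minimal sketch of coefficient Alpha and the standard error of measurement follows. It assumes unweighted, right/wrong (0/1) item scores, whereas the reliabilities reported here are based on weighted data; the function names are illustrative.

```python
from math import sqrt
from statistics import pvariance


def coefficient_alpha(scores):
    """Coefficient Alpha from a students-by-items matrix of 0/1 scores."""
    n_items = len(scores[0])
    item_variances = [
        pvariance([student[i] for student in scores]) for i in range(n_items)
    ]
    total_variance = pvariance([sum(student) for student in scores])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(total_sd, reliability):
    """SEM = SD of total scores times the square root of (1 - reliability)."""
    return total_sd * sqrt(1 - reliability)


if __name__ == "__main__":
    toy_scores = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
    print(coefficient_alpha(toy_scores))
    print(standard_error_of_measurement(2.0, 0.75))
```

Because the SEM depends on the score standard deviation as well as on the reliability, two tests with different score scales can have similar reliabilities and quite different SEMs, which is why the text prefers reliabilities for cross-battery comparison.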

The results in Table 2 suggest that the reading and math tests in the NELS:88 
battery provided an increment in reliability over that provided by their counterparts in 
the HS&B battery. This increment in reliability is particularly noticeable in the reading 
area and to a somewhat lesser extent in mathematics. The large gains in reliability in 
these two content areas are particularly welcome since they seem to be greatest for the 
minority populations. It was hoped that the reliabilities of the traditionally lower scoring 
groups, e.g., Blacks and Hispanics, could be increased without an accompanying decrease 
for the white majority. As indicated earlier one of the test construction goals in 
mathematics and reading was to provide a more rectangular distribution of difficulties 
across the low and middle difficulty levels, thereby providing additional discrimination at 
the low end of the test score distribution. 

One should keep in mind here that we are comparing different populations. A 
more accurate summary of Table 2 is that the NELS:88 reading and mathematics tests 
do a better job of assessing eighth graders than did the comparable tests in the HS&B 
battery when administered to tenth graders. It should also be pointed out that the 
NELS:88 mathematics test included two more items than did its counterpart in HS&B. 
Similarly, the NELS:88 reading test had one more item than did its counterpart in 
HS&B. These differences in numbers of items are not of sufficient size to completely 
explain the gains in reliability. The increased overall reliability (i.e., for the total 
sample) is more likely to have resulted from the fact that the test specifications took 
into consideration the intention of tailoring the tenth grade follow-up test forms (at least 
in reading and mathematics) to the ability of the students as described by their eighth 
grade scores. That is, since the eighth grade test was not intended to be re-used at 
tenth grade, it could be constructed to best measure the range of achievement expected 
in the base year without concern for potential ceiling effects later on. HS&B used the 
same test forms to measure students in both tenth and twelfth grades. This implies 
some compromises in test specifications, a constraint which was not in effect in designing 
the NELS:88 tests. 
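
The claim that the added items cannot by themselves account for the reliability gain can be checked roughly with the Spearman-Brown formula, which projects the reliability of a test lengthened with comparable items. The sketch below is ours, not part of the original analysis. It takes the HS&B mathematics total-sample reliability of .87 from Table 2 and assumes a change from 38 to 40 items; the text gives only the difference of two items, so the item counts are an assumption.

```python
def spearman_brown(reliability, length_ratio):
    """Projected reliability when test length is multiplied by length_ratio."""
    return length_ratio * reliability / (1 + (length_ratio - 1) * reliability)


if __name__ == "__main__":
    hsb_math_reliability = 0.87      # Table 2, HS&B mathematics, total sample
    assumed_length_ratio = 40 / 38   # assumed item counts, see lead-in
    projected = spearman_brown(hsb_math_reliability, assumed_length_ratio)
    print(round(projected, 3))
```

Under these assumptions the projected value stays below .88, short of the .90 observed for the NELS:88 mathematics test, which is consistent with the argument that test design rather than length produced most of the gain.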

Knowing that we were intending to change the tenth grade test allowed the test 
developers to build an eighth grade test that only needed to maximize the accuracy of 
assessment at the eighth grade. If the test development project staff had been directed 
to build a reading and mathematics form that was to be the same for both eighth and 
tenth graders, then the final eighth grade form would have been more difficult on 
average in order to minimize ceiling effects at the tenth grade level. The increased 
difficulty would, of course, tend to reduce the reliability of the eighth grade test, 
particularly for the low scoring individuals. 

Table 2.--Test reliabilities and standard errors of measurement (in parentheses), 
by race/ethnicity and sex 

                  Asian     Hispanic     Black       White       Male       Female      TOTAL 

READING 
NELS:88 Rel        .85        .79         .77         .83         .84         .83         .84 
NELS:88 SEM      (2.43)     (2.57)    [illegible] [illegible] [illegible] [illegible] [illegible] 
HS&B Rel            -         .64         .66         .76         .77         .76         .77 
HS&B SEM            -       (2.30)      (2.23)      (2.28)      (2.29)      (2.27)      (2.28) 

MATHEMATICS 
NELS:88 Rel        .92        .86         .84         .89         .90         .90         .90 
NELS:88 SEM      (3.46)     (3.70)    [illegible] [illegible] [illegible] [illegible] [illegible] 
HS&B Rel            -         .79         .76         .87         .88         .85         .87 
HS&B SEM            -       (3.57)      (3.51)      (3.51)      (3.51)      (3.53)      (3.52) 

SCIENCE 
NELS:88 Rel        .77        .67         .62         .74         .78         .72         .75 
NELS:88 SEM      (2.89)     (2.98)      (2.96)      (2.90)      (2.86)      (2.92)      (2.91) 
HS&B Rel            -         .68         .64         .69         .76         .71         .74 
HS&B SEM            -       (2.44)      (2.40)      (2.33)      (2.32)      (2.40)      (2.36) 

HISTORY/CITIZENSHIP/GEOGRAPHY 
NELS:88 Rel        .86        .81         .76         .83         .85         .82         .83 
NELS:88 SEM      (3.03)     (3.33)      (3.38)      (3.01)      (3.06)      (3.10)      (3.15) 
-- No comparable test in the HS&B battery -- 

SOURCE: U.S. Department of Education, National Center for Education Statistics, 
NELS:88 Base Year Survey and High School and Beyond Base Year Survey. 

It was encouraging to observe that the eighth grade NELS:88 Science test achieved 
about the same degree of reliability as the tenth grade HS&B test. One would not 
expect many eighth graders to be exposed at this point in their development to some of 
the material in the Science test. Given the number of life and earth science items and 
to a lesser extent chemistry items, it is believed that the test will be more appropriate 
when given to tenth graders who will have been exposed to additional coursework in 
these areas, and thus should show additional incremental gains in measurement accuracy 
at that point in time. 

Similar to the Reading and Mathematics tests, the History/Citizenship/Geography 
(HCG) test also demonstrated relatively high internal consistency reliability. The 
internal consistency reliability of the HCG test was sufficiently high to suggest that IRT 
methods could be used to put more than one form on the same scale if required in the 
follow-ups. Inspection of histograms and p-plots for the HCG test suggests a slight 
ceiling effect if we used the same form again in the tenth grade. 

A simple descriptive index of the potential for a ceiling effect is the difference 
between the mean and a perfect score divided by the standard deviation. If the 
distribution is relatively normal in the sample, then there should be slightly more than 2 
standard deviations between the mean and a perfect score. In the case of the Science 
test this index is equal to 2.47, indicating almost two and a half standard deviations 
between the eighth grade mean and a perfect score. In addition, both histograms and p- 
plots of the Science scores suggest that the sample distribution more nearly 
approximates a normal distribution than that of any of the other tests. 

The same index for the HCG test is equal to 1.87 suggesting that there is some 
potential for a ceiling effect here if the same form were used at the tenth grade. The 
results of the follow-up pretest (Rock & Pollack, 1989) also suggested the need for a 
vertically equated more difficult tenth grade form. 
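
The ceiling index described above is simple enough to sketch directly. The function names are illustrative, and treating an index below 2 as a warning sign is our reading of the statement that a roughly normal distribution should leave slightly more than 2 standard deviations of headroom.

```python
def ceiling_index(mean_score, sd_score, perfect_score):
    """Number of standard deviations between the mean and a perfect score."""
    return (perfect_score - mean_score) / sd_score


def suggests_ceiling(index, minimum_headroom=2.0):
    """Flag a potential ceiling effect; the 2.0 cutoff is an assumed reading."""
    return index < minimum_headroom


if __name__ == "__main__":
    # Index values reported in the text for the Science and HCG tests.
    for test_name, index in [("Science", 2.47), ("HCG", 1.87)]:
        print(test_name, suggests_ceiling(index))
```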

Originally both the Science and the HCG tests were considered to be candidates 
for keeping the same form at least through the tenth grade. There is little evidence 
arising from the eighth grade data that suggests that this may not be a viable way to go 
in the case of the Science test. Also using IRT methods for putting different forms of 
the Science test (e.g., different tenth & twelfth grade forms) on the same scale might be 
somewhat problematic because of the relatively low internal consistency of science items. 
Fortunately the HCG test appears to be sufficiently internally consistent for IRT scaling 
and thus there is the potential for including more difficult items in the tenth grade test. 



Item Statistics by Gender and Racial/Ethnic Groups 

Appendices A1-A4 present traditional item analysis statistics including the item 
difficulties (P+), item biserials, and deltas. The item difficulties are simply the 
proportion of students who passed a particular item. The item biserials are measures of 
the relationship between performance on a given item and on the total pool of items as 
measured by the total score. The item biserial is often considered to be a measure of 
a given item's reliability. Another way of looking at the biserial is that its size reflects the 
extent to which a given item measures the "same things" as the remainder of the test. 

Items yielding biserials of .40 are considered to be quite reliable while items at .50 
and above are considered to have excellent reliability. Items that have biserials in the 
0-.20 range, or worse yet are negative, would be candidates for replacement. 
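
A minimal sketch of the biserial computation may help fix ideas. It uses the textbook formula, in which the difference between the mean total scores of passers and failers is scaled by the total-score standard deviation and by pq/y, where y is the normal ordinate at the point dividing the distribution into proportions p and q. The function name is illustrative, the data are unweighted, and the item must have at least one pass and one fail.

```python
from statistics import NormalDist, mean, pstdev


def item_biserial(item_scores, total_scores):
    """Biserial correlation between a 0/1 item score and the total score."""
    passed = [t for i, t in zip(item_scores, total_scores) if i == 1]
    failed = [t for i, t in zip(item_scores, total_scores) if i == 0]
    p = len(passed) / len(item_scores)   # item difficulty, P+
    q = 1 - p

    std_normal = NormalDist()
    # Height of the standard normal curve at the cut separating p from q.
    ordinate = std_normal.pdf(std_normal.inv_cdf(p))

    mean_difference = mean(passed) - mean(failed)
    return mean_difference / pstdev(total_scores) * (p * q / ordinate)


if __name__ == "__main__":
    item = [1, 0, 1, 0, 1, 0]
    totals = [5, 4, 3, 3, 2, 1]
    print(round(item_biserial(item, totals), 3))
```

Unlike the point-biserial, the biserial assumes a normally distributed trait underlying the right/wrong score, so with small or non-normal samples it can exceed 1.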

The item deltas are defined as $\Delta = 4\,\Phi^{-1}(1 - P_{+}) + 13$, where $\Phi^{-1}$ is the inverse 
normal transformation that transforms a probability value into a normal deviate with 
unit variance. Thus the distribution of item deltas will have a mean delta of 13 and a 
standard deviation of 4. Item deltas are used by ETS test development specialists as the 
index of item difficulty in defining test specifications. 
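
As a quick illustration, the delta transformation can be computed with the standard library. The sketch assumes the usual ETS definition, delta = 4 times the inverse normal of (1 - P+) plus 13, so that harder items receive larger deltas; the function name is illustrative.

```python
from statistics import NormalDist


def item_delta(p_plus):
    """Convert a proportion correct (0 < P+ < 1) to the ETS delta metric."""
    normal_deviate = NormalDist().inv_cdf(1 - p_plus)
    return 4 * normal_deviate + 13


if __name__ == "__main__":
    for p_plus in (0.2, 0.5, 0.8):
        print(p_plus, round(item_delta(p_plus), 2))
```

An item answered correctly by half the students sits at the scale midpoint of 13, easier items fall below it, and harder items fall above it.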

In Appendices A1-A4, at the bottom of each column are summary statistics for the 
item analysis. The item biserials for the NELS:88 battery are all positive and relatively 
high for all groups. There is, however, a consistent tendency for the biserials to be 
somewhat lower for the Hispanics, Blacks, and American Indians. This is at least partly 
an artifact of the slightly lower total test score variances for these groups. Table 3 
below summarizes the item difficulty and biserial information by content area and 
compares these with their counterparts from the HS&B tenth grade data. As expected, 
the average biserial was somewhat higher for the NELS:88 reading and mathematics 
tests than for their counterparts in the HS&B battery. This finding is consistent with the 
higher reliabilities reported above for the NELS:88 reading and mathematics tests. 

The fact that on average the NELS:88 reading and mathematics tests were 
somewhat easier than their HS&B counterparts (i.e., higher average P+) was also 
consistent with the design specifications that attempted to increase the reliability for the 
traditionally lower scoring groups. That is, the NELS:88 reading and mathematics tests 
had proportionately more easy items than did the HS&B battery. The larger number of 
easy items minimized the possibility of observing "floor effects" for the low scoring 
groups. As indicated above, the eighth grade test specifications were less driven by 
concerns about ceiling effects in the later followups than was the case for HS&B, since 
different and more difficult forms would be introduced at the tenth grade for NELS. 

Unlike the reading and mathematics content areas, the science area was slightly 
more difficult for eighth graders than the comparable test for the HS&B tenth graders. 
This was anticipated since many eighth grade students probably had little familiarity with 
some of the content in the Science test. 

Compared to the remaining tests in the NELS:88 battery, the average difficulty of 
the HCG test items suggests that it was the easiest test. This result is, of course, 
consistent with the earlier finding of a potential ceiling effect if the same form were 
used again in the tenth grade. 

Table 3.--A comparison of average difficulty and average biserials for 
comparable tests in the HS&B and NELS:88 test battery 

                 NELS:88 Eighth Grade Average     HS&B Tenth Grade Average 
                    P+         Biserial              P+         Biserial 

READING 
  Asian            .63           .65                   Not available 
  Hispanic         .52           .57                .38           .48 
  Black            .49           .55                .37           .50 
  White            .65           .64                .52           .57 
  TOTAL            .61           .64                .48           .57 

MATHEMATICS 
  Asian            .61           .64                   Not available 
  Hispanic         .45           .51                .39           .44 
  Black            .41           .49                .36           .42 
  White            .58           .57                .53           .53 
  TOTAL            .54           .58                .49           .53 

SCIENCE 
  Asian            .56           .51                   Not available 
  Hispanic         .46           .43                .45           .48 
  Black            .42           .41                .41           .46 
  White            .57           .49                .59           .52 
  TOTAL            .53           .49                .55           .54 

HISTORY/CITIZENSHIP/GEOGRAPHY 
  Asian            .67           .62                   No comparable test 
  Hispanic         .56           .51 
  Black            .54           .48 
  White            .66           .59 
  TOTAL            .63           .58 


SOURCE: U.S. Department of Education, National Center for Education 
Statistics, NELS:88 Base Year Survey and High School and Beyond Base 
Year Survey. 

Differential Item Functioning (DIF) 



Differential Item Functioning (DIF) as defined here attempts to identify those 
items showing an unexpectedly large difference in item performance between a focal 
group (e.g. Black students) and a reference group (e.g. White students) when the two 
groups are "blocked" or matched on their total score. It should be noted that any such 
strictly internal analysis, i.e., without an external criterion, cannot detect bias when that 
bias pervades all items in the test (Cole & Moss, 1989). It can only detect differences in 
the relationships among items that are anomalous in some group in relation to other 
items. In addition such approaches can only identify the items where there is 
unexpected differential performance; they cannot directly imply bias. A determination 
of bias implies not only that differential performance on the item is related to subgroup 
membership, but also that the difference is unfairly associated with subgroup 
membership. That is, the difference is due to an attribute not related to the construct 
being measured. As Cole & Moss (1989) point out, items so identified must still be 
interpreted in light of the intended meaning of the test scores before any conclusion of 
bias can be drawn. 

The DIF program was developed at the Educational Testing Service (Holland and 
Thayer, 1986) and was based on the Mantel-Haenszel odds-ratio (Mantel and Haenszel, 
1959) and its associated chi-square. Basically, the Mantel-Haenszel (M-H) procedure 
forms odds ratios from two-way frequency tables. In a twenty item test, 21 two-way 
tables and their associated odds-ratios can be formed for each item. There are 
potentially 21 of these tables for each item since there will be one table associated with 
each total score from 0-20. The first dimension of each table is groups, e.g., Whites vs. 
Blacks, and the remaining dimension is passing vs. failing on a given item. Thus the 
question that the M-H procedure addresses itself to is whether or not members of the 
reference group, e.g., Whites, who have the same total score as members of the focal 
group, e.g., Blacks, have the same likelihood of passing the item in question. While the 
M-H statistic looks at passing rates for two groups while controlling for total score, no 
assumption need be made about the shape of the total score distribution for either 
group. 

The chi-square statistic associated with the M-H procedure tests whether the 
average odds-ratio across all 21 score levels differs from unity, i.e., equal likelihood of 
passing. 
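
The computation described above can be sketched in a few lines of code. The fragment 
below is a minimal illustration and not the ETS DIF program itself: the function name, 
the layout of each table (reference pass, reference fail, focal pass, focal fail at one 
total score level), and the example counts are all assumptions made for the illustration. 
It uses the standard Mantel-Haenszel common odds-ratio estimator and the 
continuity-corrected chi-square with one degree of freedom. 

```python
def mantel_haenszel(tables):
    """Pool 2x2 tables (one per total score level) into a common odds ratio.

    Each table is (a, b, c, d):
      a = reference group passing, b = reference group failing,
      c = focal group passing,     d = focal group failing.
    Returns (common_odds_ratio, chi_square). Hypothetical helper, for illustration.
    """
    num = den = 0.0                    # pieces of the pooled odds ratio
    sum_a = sum_exp = sum_var = 0.0    # pieces of the chi-square
    for a, b, c, d in tables:
        n = a + b + c + d
        if n < 2:
            continue                   # a score level this sparse carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_foc = a + b, c + d
        n_pass, n_fail = a + c, b + d
        sum_a += a
        sum_exp += n_ref * n_pass / n  # expected reference passes if the odds ratio is 1
        sum_var += n_ref * n_foc * n_pass * n_fail / (n * n * (n - 1))
    if den == 0 or sum_var == 0:
        raise ValueError("odds ratio or chi-square is undefined for these tables")
    common_odds_ratio = num / den
    chi_square = (abs(sum_a - sum_exp) - 0.5) ** 2 / sum_var
    return common_odds_ratio, chi_square


# Hypothetical counts at two score levels; the reference group has the higher odds.
example_tables = [(30, 10, 10, 10), (20, 20, 5, 15)]
odds_ratio, chi_square = mantel_haenszel(example_tables)
```

An odds ratio of 1.0 corresponds to equal likelihood of passing for matched members of 
the two groups; values above 1.0 in this layout indicate higher odds for the reference 
group. 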

Three columns in the M-H tables are of particular interest. The first of these 
three columns is labeled "prob > Chi-sq" and it provides a statistical test of whether or 
not the average odds-ratio significantly departs from unity. If the probability in this 
column is .05 or less then one could say that there is statistical evidence for DIF on the 
item in question. The problem with this interpretation is two-fold. First, one is making 
a number of statistical tests, one for each item, and second, if there are two relatively 
large samples involved, statistical significance is virtually guaranteed. 

Given these reservations the Educational Testing Service has developed an "effect 
size" estimate that is not sample size dependent. These effect sizes are in the column 
labeled MH D-DIF. Associated with the effect sizes is a letter code that ranges from 
"A" to "C." It is ETS's experience that effect sizes of 1.5 and above (in absolute value) 
are practically significant. Effect sizes of this magnitude, and which are statistically 
significant, are labeled with a "C." Test development experts can often inspect items 
that are characterized by such large DIF properties and in some cases be able to provide 
a reasonable explanation for the differential item functioning. This has not been the 
case for items in the A or B DIF categories. A negative sign in the MH D-DIF column 
indicates that the DIF is favoring the reference group and is against the focal or target 
group (typically the minority group). The third and last column of interest is the column 
labeled impact. This column simply shows the raw differences in the P+'s when the 
focal group's P+ is subtracted from that of the reference group. 
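
The three columns can be illustrated with a short sketch. The conversion of the common 
odds ratio to the delta metric, MH D-DIF = -2.35 ln(odds ratio), is the usual 
Holland-Thayer definition. The classification function below is a simplified stand-in 
for the ETS rule: the "C" rule (1.5 or more and statistically significant) comes from 
the text above, while the 1.0 cutoff separating "A" from "B" and the use of a single 
p-value are assumptions not stated in this report. All function names are hypothetical. 

```python
import math


def mh_d_dif(common_odds_ratio):
    """Effect size on the delta scale; negative values favor the reference group."""
    return -2.35 * math.log(common_odds_ratio)


def dif_category(d_dif, p_value, significance_level=0.05, small_cutoff=1.0):
    """Simplified A/B/C code. small_cutoff=1.0 is an assumed value, see lead-in."""
    significant = p_value <= significance_level
    size = abs(d_dif)
    if size >= 1.5 and significant:
        return "C"
    if size < small_cutoff or not significant:
        return "A"
    return "B"


def impact(p_plus_reference, p_plus_focal):
    """Raw difference in proportion correct: reference group minus focal group."""
    return p_plus_reference - p_plus_focal


# Hypothetical item: matched reference-group members have three times the odds of passing.
effect = mh_d_dif(3.0)                    # negative, so the item favors the reference group
code = dif_category(effect, p_value=0.001)
```

Because the effect size is computed from the odds ratio alone, it does not grow with 
sample size the way the chi-square does, which is the property the text relies on. 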

If DIF statistics have been obtained on pretested items, all "C" items will normally 
be replaced in construction of an operational test, unless they are needed to meet test 
specifications. This is done regardless of whether the group differences are related to 
the construct. Once a test has been administered, however, replacement of items is no 
longer an option; the only choice possible is whether to accept the questioned item or 
drop it from scoring. At this stage, it has been the policy of the Educational Testing 
Service to submit items having "C" level DIF statistics to a test development committee 
for review. If the committee can identify content that is likely to be unfamiliar to the 
subgroup in question and which is irrelevant to the skill being measured the item will 
typically be removed from the test score. However, if the identified source of difference 
is consistent with the construct being measured, or if no reason for the difference can be 
determined, the item is retained. 

Appendices B1-B20 present the tables of differential item functioning, which 
compare the base or reference group (Whites or males) with each of the racial/ethnic 
or female comparison groups. For each test content area there are five DIF tables. For 
example, Appendix B1 presents the contrast between Whites and Asians on each of the 
reading items. Appendices B2-B4 present contrasts between Whites and Hispanics, 
Blacks, and American Indians respectively. Appendix B5 presents the contrast between 
males and females on the reading items. Appendices B6-B20 repeat the same contrasts 
for the remaining three content areas. 

Inspection of the effect size columns suggests that there is little or no evidence for 
the presence of DIF in the NELS:88 test battery. In the case of reading there is only 
one "C" level item and its sign is positive, indicating that the DIF is favoring the focal 
group (American Indians in this case). There are 116 items in the NELS:88 battery and 
five group contrasts per item, so 580 DIF contrasts are being made. Because of the large 
number of contrasts being tested we will emphasize those items that show DIF for two or 
more groups. 

The only "C" level item in the reading test heavily favored American Indians over 
Whites. However, an artifact of the computational formulas in the DIF procedure is 
that easy items are much more likely to be identified as showing DIF than hard items. 
Reading item 1, with a P+ of .96 for Whites and .95 for American Indians, was by far 
the easiest item in the whole test battery. 

In the case of the mathematics test there were only two "C" level DIF items. Item 
25 favored the White students over the Black students and also favored the male students 
over the female students. Item 25 requires only simple arithmetical operations but the 
units are in centimeters. It is possible that both Black and female students may be 
somewhat less comfortable with the concept of centimeters as the units of measurement. 
Item 37 favored the reference group (Whites) when compared with the focal group (Asians). 
Item 37 is a low level problem solving geometry problem which uses the term "stick- 
lengths" in the stem. It is possible that this hyphenated word was confusing to some of 
the Asian students. Inspection of the item biserial for the Asian group (Appendix A2) 
indicates that it is quite high (.69), suggesting that the item is quite reliable and 
discriminates the high scoring Asians from the low scoring Asians. 

As mentioned earlier in the discussion of the quantitative comparison items, there 
is some concern about the possibility that they might be unfair to minority groups on the 
basis of their potential lack of exposure to the item format. Inspection of the first 
nineteen items (the quantitative comparison items) in Appendix B6 indicates that there 
are no "C" level items among the quantitative comparison items for any focal group 
comparison. In terms of "B" level items, the Asians have two, one in favor of the focal 
and one in favor of the reference group. When the Hispanics are the focal group all the 
contrasts for the first nineteen items are at the "A" level (difference is small and/or not 
statistically significant) and most of those favor the focal group. There are two "B" level 
quantitative comparison items in the Black vs. White student comparison. In both cases 
the items favor the focal group (Black students) rather than the White reference group. 
The American Indian-White student comparison only showed "A" level contrasts. It 
would appear that there is no evidence for DIF among the quantitative comparison 
items. 

The science test had only one "C" level item (item 14) and that appeared to favor 
White students over Black students. This item refers to the temperature of a mixture of 
two liquids. Subsequent review of this item by the test development committee yielded 
no insight into why this item showed DIF. As in previous examples of item DIF, 
this particular item had a respectable biserial (.50) for the Black students. 

Item 21 seemed to favor male students over females. Question 21 deals with how 
the interaction of water temperature and that of the land generates a sea breeze at the 
beach. A review of the item failed to identify any gender linked problems. 

The HCG test had 5 items that showed "C" levels of DIF. Of particular interest 
here was item 9 which showed DIF in favor of the White students when compared with 
the Asian students, Hispanic students, and the American Indian students. Item 9 asks 
the student whether "refusing to obey laws" is a way that American citizens can legally 
oppose laws or actions of officials. While the biserials are quite high for this item in all 
the subgroups in question, this item may be measuring an attitude towards protest rather 
than knowledge of what is legal and what is not legal. This item is a reasonable 
candidate for replacement in the tenth grade test. 

Item 14 also yielded "C" level DIF statistics in two reference - focal group 
comparisons. The interesting finding about this item is that it favored the focal groups 
(Asian and Hispanic students). Item 14 asks about regions of the world that "the 
greatest number of immigrants to the United States come from". 

Three other HCG items were identified, but each affected only one subgroup and in 
each case the statistic passed the cutoff for "C" items by a relatively small amount. 
Reviewers did not identify how these items are unfairly related to subgroup membership. 

Given the number of items and group contrasts one has to conclude that there was 
little differential item functioning in the eighth grade NELS:88 battery. This happy 
result is probably due to the extensive pre-review of the items by both the ETS project 
development staff as well as the NCES staff. 



Factor Structure of the NELS:88 Eighth Grade Battery 

The factor structure of the NELS:88 battery was examined from two 
complementary perspectives: 

• Convergent validity: This analysis addressed the question of whether or not 
items grouped by content into parcels would indeed define a common factor. 
For example, do four separately constructed mathematics item testlets consisting 
of arithmetic, algebra, geometry, and probability items respectively define a 
single mathematics factor? Similar content based item testlets were constructed 
as "factor markers" in each of the other three tested areas. 

• Discriminant validity: This analysis complements the convergent validity 
question in that it examines whether or not the factors defined by their marker 
testlets have discriminant validity. That is, is a mathematics factor separable 
from a reading comprehension factor and also from a science factor, etc.? 

The use of testlets to mark or define factors rather than individual items is advantageous 
since they (testlets) yield relatively continuous scores and are inherently more reliable 
than single items. 

This does not mean that other recently developed alternative methods using factor 
analysis of item responses (e.g. Bock, Gibbons, & Muraki, 1985) might not also be 
helpful here. While the Bock et al. Testfact program would in theory allow us to factor 
analyze at the item level, we have experienced considerable problems with convergence 
with item data sets of the size being analyzed here. An approximation to the Bock et al. 
factor solution at the item level is presented in a following section dealing with 
dimensionality at item response theory. 

Five testlets, each one representing a different reading passage, were used to mark 
a potential reading comprehension factor. The five testlets were based on a literary 
passage, science passage, poetry passage, biographical passage, and a historical passage. 
Four testlets were assembled to mark a mathematics factor. The four mathematics 
testlets consisted of arithmetic, algebra, geometry, and probability items respectively. 
Similarly four marker testlets were assembled from the science items. These testlets 
were composed of earth science, life science, chemistry, and scientific method items 
respectively. Three HCG testlets were formed based on History, Citizenship/Government, 
and Geography/Economic Development items respectively. 

The 16 testlets were analyzed using maximum likelihood procedures for the factor 
extraction stage. Four factors were then rotated to an oblique solution using the Promax 
procedure (Hendrickson & White, 1964). Table 4 presents the results of the exploratory 
factor rotation. The complete intercorrelation matrix of the 16 testlets appears in 
Appendix F. 

Inspection of Table 4 indicates that quite good simple structure was obtained for 
the reading, mathematics, and HCG testlets. That is, the testlets marking a reading 
factor, mathematics factor, and an HCG factor tended to have large loadings only on 
their respective factors. The science testlets, however, appear to be somewhat more 
complex and show salient loadings on the reading and mathematics factors. That is, the 
chemistry testlet loaded on the mathematics factor as well as on the science factor. 
Similarly, the life science testlet loaded to a certain extent on the reading factor in 
addition to its more salient loading on the science factor. This does not come as a 
surprise since the internal consistency reliability of the Science test was lower than was 
the case for the other tests. 

The reading, mathematics, and HCG testlets demonstrated good convergent 
validity, and the discriminant validity as measured by the factor intercorrelations was 
also reasonably encouraging. The correlation between reading and mathematics was .76, 
which approximates that found in typical factor analysis of the SAT. One might expect 
somewhat higher correlations between the NELS:88 verbal and mathematics factors than 
for their SAT counterparts since the NELS:88 sample is considerably less subject to 
selection than the SAT sample. Generally the factor correlations appear to vary little 
between the content areas and ranged from a low of .73 between Mathematics and 
History/Citizenship/Geography to a high of .80 between History/Citizenship/Geography 
and Science. 
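
The range of factor correlations quoted above can be checked directly against the 
intercorrelation block of Table 4. The short sketch below does so; the factor labels are 
inferred from the loadings and the surrounding text (Factor 1 reading, Factor 2 
mathematics, Factor 3 HCG, Factor 4 science) and are not printed in the table itself, 
and the helper name is hypothetical. 

```python
# Lower triangle of the factor intercorrelations reported in Table 4.
# Labels are an inference from the text, not part of the printed table.
labels = ["Reading", "Mathematics", "HCG", "Science"]
lower_triangle = [
    [1.00],
    [0.76, 1.00],
    [0.79, 0.73, 1.00],
    [0.75, 0.75, 0.80, 1.00],
]


def off_diagonal_pairs(lower, names):
    """Return (correlation, factor_a, factor_b) for every distinct pair of factors."""
    pairs = []
    for i, row in enumerate(lower):
        for j in range(i):          # j < i skips the diagonal
            pairs.append((row[j], names[j], names[i]))
    return pairs


pairs = off_diagonal_pairs(lower_triangle, labels)
lowest = min(pairs)     # smallest correlation and the pair of factors it belongs to
highest = max(pairs)    # largest correlation and the pair of factors it belongs to

# Squared correlations give the proportion of variance two factors share.
shared_variance = {(a, b): round(r * r, 2) for r, a, b in pairs}
```

Squaring the correlations is a convenient way to read discriminant validity: even the 
most highly correlated pair of factors leaves a substantial share of variance unshared. 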

It is expected that the correlations among these factors will be somewhat reduced 
as the students begin to sort themselves out into various curriculum tracks as they go on 
to their last four years of high school. 

Table 4.--Factor structure, NELS:88 tests 

                                    PROMAX ROTATION 

                          Factor 1      Factor 2      Factor 3      Factor 4 

Read (literature)         [illegible]   -.01          [illegible]    .11 
Read (science)            [illegible]    .17           .03          [illegible] 
Read (poetry)             [illegible]   [illegible]   [illegible]   [illegible] 
Read (biography)           .77           .00           .03          -.06 
Read (history)             .64           .03           .02          -.02 
Arithmetic                [illegible]    .80          [illegible]   [illegible] 
Algebra                    .08           .83           .03          -.06 
Geometry                   .00           .33           .02           .02 
Probability               -.02          [illegible]    .03           .11 
Earth Science              .00           .05           .14           .59 
Life Science               .21           .11           .04           .39 
Chemistry                 -.01           .29           .02           .39 
Scientific Method          .21           .03           .02           .26 
History                    .04          -.01           .75           .05 
Citizenship/Government     .11           .10           .63          -.02 
Geography/Econ. Dev.       .11           .08           .37           .19 

                                FACTOR INTERCORRELATIONS 

                           1             2             3             4 

Factor 1                  1.00 
Factor 2                   .76          1.00 
Factor 3                   .79           .73          1.00 
Factor 4                   .75           .75           .80          1.00 

SOURCE: U.S. Department of Education, National Center for Education 
Statistics, NELS:88 Base Year Survey. 

Performance of Racial/Ethnic and Gender Groups on the NELS:88 Eighth Grade Test 
Battery 

Table 5 presents means and standard deviations on the NELS:88 eighth grade tests 
by racial/ethnic and gender groups. These means are based on Item Response Theory 
(IRT) scoring using the three parameter IRT model (Lord & Novick, 1968) and the test 
weights. The scores used in these computations are the number right "true" scores 
corrected for guessing. The column in Table 5 labeled "SD-DIF" presents the mean 
differences between the racial/ethnic subgroups and the White majority group in terms 
of standard deviation units. Similarly the mean difference between male and female 
students on each of the tests is also presented in terms of standard deviation units. 
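
The SD-DIF column is a simple standardized difference, and a few entries can be 
reproduced from the rounded means and standard deviations in Table 5. The sketch below 
follows the table's footnote in dividing by the total group standard deviation; the 
function name is hypothetical, and because the published figures were presumably 
computed from unrounded means, results recomputed from the rounded table entries can 
differ in the last digit. 

```python
def sd_dif(subgroup_mean, reference_mean, total_group_sd):
    """Subgroup mean minus reference mean, in total-group standard deviation units.

    A negative value means the reference group (Whites for racial/ethnic
    comparisons, males for sex comparisons) had the higher mean.
    """
    return (subgroup_mean - reference_mean) / total_group_sd


# Reading values as printed in Table 5 (total group S.D. = 6.0).
hispanic_reading = sd_dif(7.8, 11.4, 6.0)    # Hispanic vs. White
female_reading = sd_dif(11.0, 9.6, 6.0)      # female vs. male

# Mathematics values as printed in Table 5 (total group S.D. = 11.3).
asian_mathematics = sd_dif(19.9, 18.0, 11.3)  # Asian vs. White
```

Rounded to one decimal place, these three values agree with the corresponding SD-DIF 
entries in the table. 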

Inspection of Table 5 suggests that the mean differences in terms of standard 
deviation units between the non-Asian racial/ethnic groups and the White majority 
group are of about the same magnitude as those found for the 1980 tenth grade 
HS&B sample. The eighth grade female students are doing somewhat better than the 
male students at reading and about as well in mathematics. At the same time, females 
are doing somewhat less well than the male students in both science and 
history/citizenship/geography. It would appear that as early as the eighth grade, female 
students are beginning to fall behind in science. 



Proficiency Level Subscores by Subgroups 

In addition to providing scores for each of the four test content areas, behaviorally 
anchored proficiency level scores will also be reported in Reading and Mathematics. 
These proficiency level scores attempt to relate meaningful behaviors to various points 
on the total score scale. Three levels of mathematics proficiency and two levels of 
reading proficiency will be reported in addition to the usual normative scores for eighth 
graders. The three proficiency levels in mathematics form a hierarchical scale with each 
succeeding level characterized by increased complexity and where proficiency at a higher 
level implies proficiency at the lower levels. This Guttman scale property provides a 
limited amount of diagnostic information. The three mathematics proficiency levels 
define the following types of achievement: 

• Level 1- Students who are proficient at this level are able to successfully carry 
out simple arithmetical operations on whole numbers. 

• Level 2- Students who are proficient at this level have successfully mastered all 
the Level 1 tasks above as well as having mastered simple operations with 
decimals, fractions, and roots. 

• Level 3- Students who are proficient at this level have mastered the two lower 
proficiency levels and are able to successfully solve simple problem solving tasks. 
Unlike levels 1 and 2 which require the rote application of rules, performance at 
this level requires conceptual understanding and/or the development of a solution 
strategy. 

Table 5.--Weighted means and standard deviations of IRT scores on the NELS:88 tests, by racial/ethnic groups and sex 

                  TOTAL GROUP   WHITE         ASIAN                  HISPANIC               BLACK                  AMERICAN INDIAN 
                  MEAN   S.D.   MEAN   S.D.   MEAN   S.D.  SD-DIF*   MEAN   S.D.  SD-DIF*   MEAN   S.D.  SD-DIF*   MEAN   S.D.  SD-DIF* 

READING           10.3    6.0   11.4    5.9   10.8    6.2     -0.1    7.8    5.5     -0.6    7.1    5.3     -0.7    6.9    5.2     -0.7 
MATHEMATICS       16.0   11.3   18.0   11.0   19.9   12.2      0.2   11.0    9.9     -0.6    8.9    9.1     -0.8    9.4    9.0     -0.8 
SCIENCE            9.9    5.7   10.9    5.6   10.6    6.0     -0.1    7.5    5.0     -0.6    6.3    4.5     -0.8    6.5    4.9     -0.8 
HIST/CIT/GEOG     15.1    7.6   16.4    7.2   16.1    8.2      0.0   11.6    7.7     -0.6   11.2    6.8     -0.7   10.5    7.2     -0.8 

                  MALE          FEMALE 
                  MEAN   S.D.   MEAN   S.D.  SD-DIF* 

READING            9.6    6.1   11.0    5.9      0.2 
MATHEMATICS       16.1   11.5   15.9   11.1      0.0 
SCIENCE           10.5    6.0    9.5    5.4     -0.1 
HIST/CIT/GEOG     15.4    7.9   14.8    7.3     -0.1 

NUMBER OF CASES 

                      WHITE        ASIAN     HISPANIC        BLACK      AM.IND.         MALE       FEMALE 

READING              15,756        1,500        3,005        2,858          308       11,755       11,887 
MATHEMATICS          15,753        1,495        2,996        2,860  [illegible]       11,750       11,878 
SCIENCE              15,758        1,493        2,995        2,845          307       11,750       11,865 
HIST/CIT/GEOG        15,693        1,487        2,981        2,84?          508       11,692       11,832 

* Difference between subgroup mean and reference group mean in terms of the total group standard deviation. An associated negative sign indicates 
that the reference group (Whites for racial/ethnic comparisons; males for sex comparisons) had a higher mean. 

Mayer, Larkin, & Kadane (1984) also present a hierarchical model based on four 
knowledge structures. However, their model emphasizes a hierarchy of cognitive 
processing skills which are most appropriate for mathematics tests such as the SAT-M 
which almost entirely emphasizes problem solving skills. Their four model components 
are factual/linguistic, algorithmic, schematic, and strategic. The eighth grade proficiency 
level model suggested here follows more of a learning or curriculum sequencing model 
than either the Mayer et al. model or a similar cognitive processing model developed for 
the SAT-M by Rock and Johnson (1989). A major feature shared, however, by the 
eighth grade curriculum sequencing model and the models espoused by Mayer et al. and 
Rock et al. is that the components are assumed to be sequentially dependent during 
problem solving. That is, for successfully implementing a schema the problem solver 
should have mastered the requisite factual/linguistic knowledge necessary to read the 
problem. 

In a primarily achievement oriented mathematics test such as the NELS eighth 
grade mathematics test, it was felt that the hierarchical dependencies should follow the 
typical learning or curriculum sequence. That is, mastery of simple operations on whole 
numbers is a necessary but not sufficient condition for mastery of simple operations on 
decimals and fractions, etc. As NELS proceeds through the upper grades it is likely that 
there will be fewer individual differences on the simple declarative or algorithmic 
knowledge and more between-individual variability on the problem solving skills. Thus, 
proportionately greater emphasis can be put on the development of problem solving 
skills in the succeeding followups. This does not mean that the simple declarative 
knowledge and algorithmic procedures will be missing from the tenth grade followup. In 
fact the hierarchically ordered skills model as presented here is particularly appropriate 
for the multi-level testing procedure which is to be implemented at the tenth grade. 
Since the tenth grade multi-level forms are tailored to groups of students classified by 
their achievement levels (based on their eighth grade performance), the lower level 
forms will have a greater proportion of the simple algorithmic operations while the 
second and highest level forms will increasingly consist of items requiring conceptual 
understanding and production level problem solving skills. The hierarchical skill 
conception leads quite naturally to the multi-level testing model. 

Two kinds of proficiency score interpretations are available. The first kind of 
interpretation is consistent with the typical usage in the criterion referenced literature 
(Glaser, 1963). It simply states whether or not a student is above or below a given 
threshold, e.g., Level 1 performance. A second interpretation has a more normative 
slant in that it gives the probability that a given student is proficient at a given level, say 
Level 1. Each student will have three mathematics proficiency probabilities, one for each 
of the three mathematics levels. Changes in an individual's proficiency probabilities as 
he or she goes from the eighth to the tenth grade indicate where on the development 
growth curve that individual is making progress. For example, an individual who 
increases his or her problem solving skills between eighth and tenth grade will show 
changes in the probability of being proficient at Level 3, but show little or no change in 
his or her probabilities of Level 1 or Level 2 proficiency. 

At this time, we will only present results on the criterion referenced type of 
interpretation. That is, we will report, for example, what percentage of a subgroup are 
proficient at Level 1 but have not mastered Level 2, and so on. Proficiency probabilities 
described in the second interpretation, which are most useful for measuring change over 
time, will be included in the presentation of results when grade 10 data are available. 

Each proficiency level is marked by a block of 4 items that are relatively internally 
consistent with respect to the cognitive processes required. For example, level one 
marker items all deal with simple arithmetical operations on whole numbers. In 
addition to requiring the same cognitive operations, the items within a particular 
"marker" block should exhibit similar item difficulty parameters. Since the underlying 
cognitive demand model is assumed to be hierarchical, students who are proficient on 
the level 3 block of marker items should also demonstrate proficiency on the level 2 and 
level 1 items. If a student demonstrates proficiency on a higher level block but not on a 
lower level block, one must infer that the hierarchical model did not fit that particular 
individual. While four items may seem like a relatively small number of items, it should 
be remembered that all four are essentially parallel measures of the same content or 
processing skill. The four items are not a subscale that attempts to discriminate 
individuals all along a continuous dimension but are simply used to make a "go/no go" 
decision at a certain point referencing a specific skill. Evidence for the internal 
consistency of the hierarchical model is the low rate of reversals in the response 
patterns. About 95% of the students in all the subgroups had response patterns to the 
marker blocks that were consistent with the hierarchical model. See Appendix G for a 
detailed description of the way in which the proficiency scores were defined. 
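
The consistency check described above can be sketched as follows. This is an 
illustration only: the actual mastery rule is defined in Appendix G, so the cutoff of 
three correct answers out of four used here is a hypothetical stand-in, as are the 
function names and the four example students. 

```python
def block_mastered(item_scores, min_correct=3):
    """True if enough items in a four-item marker block were answered correctly.

    min_correct=3 is an assumed cutoff, not the Appendix G definition.
    """
    return sum(item_scores) >= min_correct


def mastery_pattern(blocks, min_correct=3):
    """blocks: 1/0 item scores for the Level 1, Level 2, and Level 3 marker blocks."""
    return tuple(block_mastered(block, min_correct) for block in blocks)


def fits_hierarchy(pattern):
    """A pattern fits if no level is passed after a lower level has been failed."""
    failed_lower = False
    for passed in pattern:
        if passed and failed_lower:
            return False          # a reversal: higher level passed, lower level failed
        failed_lower = failed_lower or not passed
    return True


def reversal_rate(patterns):
    """Proportion of students whose pattern contradicts the hierarchical model."""
    reversals = sum(1 for pattern in patterns if not fits_hierarchy(pattern))
    return reversals / len(patterns)


# Four hypothetical students; each inner list is one marker block of four items.
students = [
    [[1, 1, 1, 1], [1, 1, 1, 0], [0, 1, 0, 0]],   # proficient at Levels 1 and 2
    [[1, 1, 1, 1], [0, 0, 1, 0], [0, 0, 0, 0]],   # proficient at Level 1 only
    [[1, 1, 0, 1], [0, 1, 0, 0], [1, 1, 1, 0]],   # Level 3 without Level 2: a reversal
    [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]],   # below Level 1
]
patterns = [mastery_pattern(student) for student in students]
rate = reversal_rate(patterns)
```

In this toy data set one student in four shows a reversal; the report's figure of about 
95% consistent patterns corresponds to a reversal rate of roughly .05. 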

Figure 4 presents a proficiency profile of racial/ethnic groups on the 
mathematics test. It is clear from Figure 4 that there are relatively large group 
differences with respect to the type of problems that they can solve. Three-quarters 
(28% + 47%) of the eighth grade Hispanic students and nearly four-fifths (29% + 49%) 
of the Black students have not yet demonstrated proficiency with simple operations on 
decimals and fractions. By comparison, about 53% of the Whites and 44% of the Asians 
have yet to achieve proficiency in operations on decimals and fractions. The largest group 
differences occur at the most complex proficiency level which was defined by marker 
items requiring low level problem solving skills and/or conceptual understanding. The 
Asian students in particular are overrepresented at this proficiency level. 

Figure 5 presents the mathematics proficiency profiles for the two sex groups. 
Inspection of Figure 5 indicates quite similar proficiency profiles for the male and 
female students. 

Figure 4.--Percent of selected subgroups that are proficient 
at each mathematics proficiency level 

Figure 5.--Percent of gender groups that are proficient 
at each mathematics proficiency level 

The two levels of proficiency that have been defined in the reading area are: 

• Level 1- Simple reading comprehension including reproduction of detail and/or 
the author's main thought. 

• Level 2- Ability to make inferences beyond the author's main thought and/or 
understand and evaluate relatively abstract concepts. 

Figure 6 presents a reading level proficiency profile for selected racial/ethnic 
groups. As in the case of Mathematics, there are considerable differences between the 
groups with respect to the various mastery levels. The percentage of Asian and White 
students who have demonstrated proficiency at the inference level is about double that 
of the Hispanic and Black students. 

Figure 7 presents the reading proficiency profile for the two sex groups. As in the 
case of mathematics, there is little difference between the patterns of proficiency for the 
sex groups at the eighth grade. 

Item Response Theory (IRT) Parameters for the NELS:88 Battery 

As pointed out above, the multi-stage testing strategy requires both vertical 
equating and lateral equating. That is, forms that vary between grade (vertical equating) 
as well as forms that vary within grade (lateral equating) must all be put on the same 
scale. The most efficient way of accomplishing this is to use Item Response Theory 
(IRT) equating. The previously reported item statistics (including the estimates of 
internal consistency reliability) support the feasibility of IRT scoring and eventually IRT 
based equating for at least the mathematics, reading, and History/ Citizenship/ 
Geography tests. The following section provides further evidence of the relatively 
unifactorial nature of these three tests and thus their appropriateness for IRT 
applications. 

Tetrachoric correlations among items within a content area were estimated and 
corrected for guessing. Principal components analysis was performed on each of the 
content area tetrachoric matrices. One simple factor analytic measure of the relative 
unidimensionality of the content areas is the ratio of the first and largest component to 
the second component (Reckase, 1979; Hulin, Drasgow, & Parsons, 1983). These ratios 
for reading, mathematics, science, and history/citizenship/geography were 10:1, 12:1, 
6:1, and 6:1, respectively. While all four show a single dominant factor, the reading and 
mathematics measures show a particularly dominant single factor. These results based 
on guessing-corrected tetrachoric matrices suggest that IRT estimation would provide 
reasonable estimates in all four content areas. 
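
The ratio described above can be illustrated with a small sketch, read here as the 
ratio of the two largest eigenvalues of the item correlation matrix. The code assumes 
the correlation matrix is already available; it does not estimate tetrachoric 
correlations or apply the guessing correction. It uses plain power iteration with 
deflation so that it needs no external library, and the function names and the 4 x 4 
example matrix are hypothetical. 

```python
def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * vector[j] for j in range(len(vector))) for row in matrix]


def dominant_eigen(matrix, iterations=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(matrix)
    vector = [float(i + 1) for i in range(n)]     # arbitrary non-uniform start vector
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = sum(x * x for x in product) ** 0.5
        vector = [x / norm for x in product]
    product = mat_vec(matrix, vector)
    eigenvalue = sum(vector[i] * product[i] for i in range(n))   # Rayleigh quotient
    return eigenvalue, vector


def first_to_second_ratio(correlations):
    """Ratio of the largest to the second largest eigenvalue."""
    first, v = dominant_eigen(correlations)
    n = len(correlations)
    # Deflation: remove the first component, then find the largest remaining one.
    deflated = [[correlations[i][j] - first * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    second, _ = dominant_eigen(deflated)
    return first / second


# Hypothetical four-item matrix in which every pair of items correlates .50.
example = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
ratio = first_to_second_ratio(example)
```

For the example matrix the two largest eigenvalues are 2.5 and 0.5, giving a ratio of 
5:1. Ratios such as the 10:1 and 12:1 reported for reading and mathematics indicate a 
much more dominant first component. 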

While factor analytic or principal component methods provide some useful 
information on the unidimensionality of the respective item pools, Lord often argued 
that one should go ahead and compute the IRT parameters and then examine the 
discrimination indices and the item trace lines for lack of fit. A monotonically 
increasing trace line that comes close to the mean proportion correct for clusters of 
examinees grouped by ability level is evidence that the IRT model is a good description 
for the item and the test. 

Figure 6.-- 

Figure 7.--Percent of gender groups that are proficient 
at each reading proficiency level 

Appendices C1-C4 present the IRT item parameters for the reading, mathematics, 
science, and history/citizenship/geography eighth grade tests. The item parameters were 
computed using the Logist program (Wood et al., 1976). Item response theory (IRT) 
describes the probability of answering an item correctly as a mathematical function of 
ability level and characteristics of the items. The mathematical function used here, the 
logistic function, has one parameter for each individual's ability level and three 
parameters characterizing each item (Lord, 1980; Lord & Novick, 1968). The item 
parameters reflect difficulty level (b), discriminating power (a), and the likelihood of 
low ability individuals guessing the right answer (c). The function that gives the 
probability of passing a particular item $i$ for a person of ability $\theta$ in terms 
of the item parameters is: 

$$P_i(\theta) = c_i + \frac{1 - c_i}{1 + \exp[-D a_i (\theta - b_i)]} \qquad (1)$$ 

where $D$ = 1.7 

$b_i$ = item difficulty, corresponding to the value of $\theta$ halfway between the 
guessing parameter and 1.0 

$a_i$ = discrimination parameter reflecting the steepness of the item characteristic 
curve at its point of inflection 

$c_i$ = "guessing parameter," the probability of a person with very low ability getting 
the item correct 

$\theta$ = a person's ability parameter, usually standardized with mean 0 and standard 
deviation of 1.0 

and $P_i(\theta)$ = probability of a correct response for a person of ability level $\theta$. 
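
Equation (1) translates directly into code. The sketch below is illustrative and is not 
the Logist program; the function name and the example parameter values are 
hypothetical. It evaluates the three-parameter logistic function and shows the property 
used to define item difficulty: when ability equals $b_i$, the probability of a correct 
response is halfway between $c_i$ and 1.0. 

```python
import math

D = 1.7  # scaling constant from equation (1)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response.

    theta = ability, a = discrimination, b = difficulty, c = guessing parameter.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


# Hypothetical item: a = 1.2, b = 0.5, c = 0.2.
at_difficulty = p_correct(0.5, 1.2, 0.5, 0.2)    # theta equal to b: (1 + c) / 2
low_ability = p_correct(-3.0, 1.2, 0.5, 0.2)     # approaches c
high_ability = p_correct(3.0, 1.2, 0.5, 0.2)     # approaches 1.0
```

The curve rises monotonically from $c_i$ toward 1.0 as ability increases, and it rises 
more steeply around $b_i$ the larger $a_i$ is. 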

A person's number right true score (NRTS) is the simple sum of that particular 
person's $P_i(\theta)$'s. Thus the scoring weights each item receives in the summation to 
arrive at NRTS are a function of the interaction of the item parameters with the person's 
$\theta$ or ability level. That is, the item characteristic functions, $P_i(\theta)$, 
provide a different score for a given item, depending upon a person's ability level. 
Inspection of the item characteristic function in equation (1) suggests that, for high 
ability people, the item score for a given item $i$ will primarily depend on how much 
higher the person's $\theta$ is compared to the item difficulty ($b_i$, also measured in 
$\theta$ units), and how discriminating the item is. 

A low-ability person will get little credit on a difficult item, even if he or she were 
to get it correct, because the model argues that the correct answer was probably 
guessed. This readily follows from equation (1). Such a person might have a $\theta$ (ability 
level) that was negative, say -1.5, and the $b_i$ for a difficult item on the $\theta$ scale might be 
2.0. Since $a_i$ is always positive, the denominator of equation (1) would become large 
in relation to the numerator. The limit here as the denominator gets larger is a scoring 
weight $P_i(\theta)$ equal to $c_i$, the guessing parameter. 
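
The behavior described above can be checked numerically. The following is a minimal sketch of equation (1) and of the number right true score, not the LOGIST implementation; the function names and the item parameter values (a = 1.0, b = 2.0, c = 0.25) are assumptions chosen only for illustration.

```python
import math

D = 1.7  # scaling constant from equation (1)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response, equation (1)."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def number_right_true_score(theta, items):
    """Sum of P_i(theta) over a list of (a, b, c) item parameter tuples."""
    return sum(p_correct(theta, a, b, c) for a, b, c in items)


# Hypothetical difficult item (illustrative values, not taken from Appendix C).
hard_item = (1.0, 2.0, 0.25)

low = p_correct(-1.5, *hard_item)   # low-ability examinee: close to c
high = p_correct(3.0, *hard_item)   # high-ability examinee: well above c
print(round(low, 3), round(high, 3))
```

With these assumed values the low-ability examinee's item score is barely above the guessing parameter of 0.25, while the high-ability examinee's score is much closer to 1.0, which is the pattern the text describes.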
The fact that the item scores that are summed to get the number right true score 
are a function of the person's ability level $\theta$ and of the item discrimination, difficulty, and guessing 
parameters suggests that IRT scoring can be beneficial if (1) people with low ability can 
get the right answer by guessing; (2) items in the test vary in both difficulty and 
discrimination, so that an optimal scoring procedure should take this into account; (3) 
there are test center administration irregularities with respect to directions or timing that 
may lead to varying levels of items attempted; and (4) the purpose is to put tests that 
share some but not all of the same items on the same scale. 

Inspection of Appendices C1-C4 indicates that only one item had a discrimination 
parameter ("a" parameter) in the .30s. This was a reading item (item 10) which had a 
difficulty parameter ("b") of 1.7, indicating that it was relatively difficult. The item was 
classified as requiring an inferential cognitive step. This item's biserial was in the .40s 
(Appendix A-1), suggesting that it may be reasonably reliable from the traditional 
psychometric viewpoint. 

The summary statistics at the bottom of each column give the mean and standard 
deviation for each test's item parameters. In three of the four tests, the average 
discrimination parameter was greater than unity. In the fourth test, science, the average 
discrimination was only slightly less than unity (.98). Item discrimination parameters 
of 1.0 and above are considered very good. Further investigation of the residuals for each 
item trace curve (not shown here) suggests that the IRT model fit quite well in reading, 
mathematics, and history/citizenship/geography, and was reasonably acceptable in science. 

With respect to both the skewness of the estimated theta distribution and the 
estimation of item parameters on the unweighted sample, Yamamoto (1990) has carried 
out empirical studies comparing weighted and unweighted samples, and skewed and unskewed 
theta distributions, for both BILOG and LOGIST IRT estimation. His preliminary 
results suggest that there is bias in both the a and b parameters, but LOGIST seems 
more robust when the normality assumption is violated and/or the unweighted 
sample is used to estimate the IRT parameters. Although there may be 
differences in IRT parameters for various weightings and degrees of skewness, differences in theta 
means among various subgroups remain relatively invariant over violations of normality 
assumptions in the theta distributions and/or the use of weighted or unweighted 
samples. Work being carried out for NAEP may provide more information about this 
issue in the future. 

Appendices D-1 through D-4 present test information functions for each of the 
tests. The information function is a simple transformation of the standard error of 
measurement: it is the reciprocal of the square of the SEM. Since it is impractical to 
present standard errors of measurement for each point in the score scale, the plot 
represents a picture of the estimated accuracy of measurement along the entire ability 
range. A high point on the plot corresponds to greater accuracy. For each of the four 
tests, the information function is above 1.0 for the ability range -2.0 to +2.0 (which 
includes more than 90% of the students), indicating a standard error of measurement of 
less than one score point in that range. 
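
The reciprocal relationship between information and the standard error of measurement can be sketched as follows. This is an illustration only: it uses the standard item information formula for the three-parameter logistic model, and the function names and item parameters are hypothetical rather than values from Appendices C or D.

```python
import math

D = 1.7  # scaling constant from equation (1)


def p_correct(theta, a, b, c):
    """Three-parameter logistic probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def item_information(theta, a, b, c):
    """Item information for the three-parameter logistic model."""
    p = p_correct(theta, a, b, c)
    q = 1.0 - p
    return (D * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2


def total_information(theta, items):
    """Test information: the sum of item information over (a, b, c) tuples."""
    return sum(item_information(theta, a, b, c) for a, b, c in items)


def sem(theta, items):
    """Standard error of measurement: reciprocal square root of information."""
    return 1.0 / math.sqrt(total_information(theta, items))


# Hypothetical four-item test (illustrative parameters only).
items = [(1.0, -1.0, 0.2), (1.2, 0.0, 0.2), (0.9, 0.5, 0.25), (1.1, 1.0, 0.2)]
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(total_information(theta, items), 3), round(sem(theta, items), 3))
```

Whenever the information at a given ability level exceeds 1.0, the corresponding standard error on the theta scale is below 1.0, which is the reading of the plots given in the text.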
Test Scores on User Tape 



The user tape of NELS:88 base year data available from NCES contains a variety 
of formulations of the test scores for the convenience of analysts. For each of the four 
cognitive tests, number of correct answers, number of wrong answers, and number of 
items omitted are included. A formula score for each test consists of the number right 
minus a proportion of the number wrong, and represents an effort to correct for score 
differences that are attributable to different response styles with respect to guessing, 
rather than to differences in knowledge of the correct answers. That is, one student may 
have a tendency to guess at random if he or she does not know the answer to a 
question, while another will simply leave the item blank. For four-choice test items, the 
expectation is that one fourth of the random guesses are likely to be correct, thus raising 
the number-right score for the student who chooses to guess over that of a student of 
equal ability who omits unknown items. The guessing correction subtracts a proportion 
of the wrong answers from the number right, with the proportion depending on the 
number of answer-choices for the items. In the case of four-choice items, again, the 
assumption is made that random guessing will produce approximately one-fourth correct 
answers and three-fourths wrong. So subtracting one-third of the incorrect answers from 
the number right produces an estimate of the score that would have been attained by 
another student of equal ability who chose to omit items instead of guessing. 
Computation of formula scores on the user tape took into account the number of answer 
choices for each incorrect item, that is, by subtracting 1/(n-1) for each wrong answer, 
where n is the number of response options. Omitted items are not treated as wrong, 
and do not enter into computation of formula scores. 
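
The guessing correction described above can be written out as a short sketch. The function name and the response coding ("right", "wrong", "omit") are assumptions made for illustration; they are not the variable names used on the user tape.

```python
def formula_score(responses):
    """Formula score from a list of (status, n_options) pairs.

    status is "right", "wrong", or "omit"; n_options is the number of
    answer choices for that item.
    """
    score = 0.0
    for status, n_options in responses:
        if status == "right":
            score += 1.0
        elif status == "wrong":
            # Each wrong answer costs 1/(n-1) of a point.
            score -= 1.0 / (n_options - 1)
        # Omitted items contribute nothing to the score.
    return score


# Hypothetical examinee: 12 right, 6 wrong, 3 omitted, all four-choice items.
responses = [("right", 4)] * 12 + [("wrong", 4)] * 6 + [("omit", 4)] * 3
print(formula_score(responses))
```

For this hypothetical examinee the six wrong answers remove two points (6 x 1/3), so the formula score is approximately 10 against a number-right score of 12.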

IRT number-right scores, as discussed in detail in the section on IRT earlier, 
represent the sum of the probabilities of correct answers on each of the items in the 
test, given an individual's overall ability level. The IRT formula score on the user tape 
is a transformation of this score, in which a correction is made for the probability of an 
incorrect response, 1 - P_i, on each item. The correction factor, (1 - P_i)/(n - 1) for each item, 
is subtracted from the IRT number-right score. While this is not necessary as a 
correction for guessing, since the possibility of guessing is already compensated for in 
the IRT model, the IRT formula score is preferred by some researchers since it more 
nearly approximates the range, mean, and variance of the raw formula score metric. 
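
As a rough sketch of the transformation just described, the IRT formula score can be computed from each item's probability of a correct response and its number of answer choices. The function name and input layout are hypothetical.

```python
def irt_formula_score(items):
    """IRT formula score from a list of (p_correct, n_options) pairs.

    Each item contributes P_i minus the correction (1 - P_i) / (n - 1).
    """
    return sum(p - (1.0 - p) / (n - 1) for p, n in items)


# Hypothetical four-choice items: one answered at chance level, one certain.
print(irt_formula_score([(0.25, 4), (1.0, 4)]))
```

An item on which the examinee's probability of success equals the chance level (0.25 for a four-choice item) contributes zero, just as random guessing is expected to contribute zero to the raw formula score. This is one way to see why the two metrics have a similar range.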

The final scores included in the NELS:88 user tape are standardized scores for 
each test, with each content area scaled to an estimated national mean of 50 and 
standard deviation of 10. This is accomplished by simply subtracting the weighted 
overall mean from each raw formula score, dividing by the standard deviation, 
multiplying by 10, and adding 50. Analysts find this formulation useful because it 
provides a convenient framework for comparison of individual or subgroup scores with 
national averages. For example, a subgroup average of 55 in standardized units 
represents an achievement level half a standard deviation higher than the national 
average. The standardized composite on the user tape is the average of the reading and 
mathematics standardized scores. 
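
The standardization step can be sketched as below. The text states that the weighted overall mean is subtracted; this sketch additionally assumes that the standard deviation is the weighted population standard deviation, and the function name and sample values are hypothetical.

```python
import math


def standardize(scores, weights):
    """Rescale raw formula scores to weighted mean 50 and standard deviation 10."""
    total = sum(weights)
    mean = sum(w * x for x, w in zip(scores, weights)) / total
    # Assumption: weighted population variance (divide by the sum of weights).
    variance = sum(w * (x - mean) ** 2 for x, w in zip(scores, weights)) / total
    sd = math.sqrt(variance)
    return [50.0 + 10.0 * (x - mean) / sd for x in scores]


# Hypothetical raw formula scores and sample weights for four students.
raw = [4.0, 10.0, 12.0, 18.0]
wts = [1.0, 2.0, 2.0, 1.0]
print([round(z, 1) for z in standardize(raw, wts)])
```

Under the same assumptions, a standardized composite would be the simple average of a student's reading and mathematics values from this function.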
Quartile scores based on the raw formula score for each content area, as well as 
for the standardized composite, are included on the tape. These simply break each 
weighted score distribution into fourths, and are included for the convenience of users 
who require a simple way of dividing the sample by achievement level. 

Approximately 4% of the 24,599 students who completed questionnaires did not 
have test scores. There were several reasons for missing test scores: (1) In some cases, 
initial parent refusal to let the student participate was turned around when the parent 
was recontacted for the parent survey in the summer. In such cases, students were 
interviewed by telephone, but no tests were administered. (2) Several schools refused 
the test component of the survey because of the time burden but agreed to do the 
student questionnaire. (3) In school-administered makeup days, typically only the 
student questionnaire was administered. (4) Some materials were lost in transit. In 
some of these cases the questionnaire was then administered by telephone, but not the 
test. (5) Some of the students were present for the test administration but failed to 
answer items in one or more sections of the test. Test sections were not scored if fewer 
than five items were answered. Special sample weights adjusted for test nonresponse 
were used for analyses in this report, and differ in this respect from the basic student 
weight (BYQWT) on the public use tape. 
CHAPTER 4. CONCLUSIONS 



The results suggest that for the most part the NELS:88 eighth grade test battery 
either met or exceeded its psychometric objectives. While the allotted testing time was 
only about one and a half hours, quite acceptable reliabilities were obtained for the 
Reading Comprehension, Mathematics, and History/Citizenship/Geography tests. In 
fact, the NELS:88 battery reliabilities significantly exceeded their counterparts in the 
previous HS&B test battery. 

These internal consistency reliabilities were sufficiently high to justify the use of 
Item Response Theory (IRT) scoring, and thus, provide the framework for constructing 
follow-up forms that will be more adaptive to the ability level of the student. The IRT 
scaling will enable the researcher to administer forms varying in difficulty (at the tenth 
grade) depending on the student's previous (eighth grade) achievement scores in the 
areas of Reading, Mathematics, and possibly History/Citizenship/Geography. This 
adaptive approach will both minimize potential ceiling effects when the students are 
followed up as tenth graders and help to increase measurement accuracy. 

The Science test was considerably less unifactorial than the other tests. This 
finding poses less of a problem in the Science area since there appears to be little 
possibility of ceiling effects at least up to and including the tenth grade. Thus, there 
appears to be little need for a tenth grade form that is adaptive. 

There was little evidence of differential item functioning (DIF) for either gender 
or racial/ethnic groups. 

Factor analytic results supported the discriminant validity of the four content 
areas. Convergent validity was also indicated by the salient loadings of the testlets 
composed of "marker items" on their hypothesized factors. 

In addition to providing the usual normative scores in all four tested areas, 
behaviorally anchored proficiency level scores are available in both the Reading and 
Mathematics areas on the NELS:88 public release tapes. 
Appendix A-1 
Item Analysis Statistics, Reading 



TOTAL 



Co 



ITEM 1 
ITEM 2 
ITEM 3 
ITEM 4 
ITEM 5 
ITCH 6 
ITEM 7 
ITEM 6 
ITEM 9 
ITEM 10 
ITEM 11 
ITEM 12 
ITEM IS 
ITEM 14 
ITEM 15 
ITEM 16 
ITEM 17 
ITEM 18 
ITEM 19 
ITEM 29 
ITEM 21 
COLUMN MEAN 
COLUMN S.O. 

SAMPLE SIZE 
POPULATION ESTIMATE 

COEFFICIENT ALPHA 
SPLIT HALF RELIABILITY 



FORMULA SCORE 
NUMBER RI6MT 
NUMBER WONG 
NUMBER OMITS 
NUMBER NOT REACHED 



P» 
9.95 
9.65 
8.02 
9.57 
9. 55 
9.69 
0.41 
9.49 
0.61 
0.39 
0.59 
0.71 
0.50 
0.46 
0.46 
0.76 
0.53 
0.54 
0.63 
0.70 
0.62 
0.61 
0.14 



RBIS 
0.59 
9.62 
9.65 
0.66 
0.67 
0.65 
0.63 
0.66 
0.56 
0.45 
0.65 
0.76 
0.55 
0.65 
0.70 
0.74 
0.67 
0.53 
0.66 
0.64 
i^Z 
0.64 
0.07 



23679 
3005290 

0.64 
0.65 



OELTA 
6.5 
6.6 
9.3 
12.3 
12.5 
12.0 
13.9 
13.1 
11.9 
14.1 
12.1 
10.6 
13.0 
13.2 
13.9 
10.1 
12.7 
12.6 
11./ 
10.9 

JOA 
11.7 
1.6 



10.2 
12.6 
8.0 
0.2 
0.2 



5.D. 
6.16 
4.81 
9.64 
0.65 
1.26 



P* 
0.93 
0.65 
9.60 
0.53 
0.53 
0.61 
0.39 
0.46 
0.56 
0.36 
0.54 
0.66 
0.52 
0.45 
0.43 
0.73 
0.49 
0.51 
0.59 
0.67 
0.60 
0.56 
0.14 



RBIS 
O-aO 
0.61 
0.63 
8.65 
0.62 
0.68 
0.64 
0.66 
0.55 
0.50 
0.65 
0.75 
0.56 
0.64 
0.70 
0.75 
0.64 
0.51 
0.65 
0.63 
0.59 
0.63 
0.06 



11689 

1495064 

0.84 
0.85 



OELTA 
7.0 
6.9 
9.7 
12.7 
12.7 
11.9 
14.1 
13.2 
12.4 
14.2 
12.6 
11.4 
12.6 
13.5 
13.7 
10.5 
13.1 
12.9 
12.0 
11.3 

12.0 
1.7 



9.5 
12.1 
8.4 
0.2 
0.3 



6.21 
4.85 
4.68 
0.69 
1.42 



9* 

0.96 
0.66 
0.85 
0.62 
0.57 
0.60 
0.42 
0.50 
0.66 
0.40 
0.63 
0.76 
0.49 
0.50 
0.49 
0.79 
0.57 
0.56 
0.66 
0.74 
0.64 
0.63 
0.15 



snis 

0.56 
0.64 
0.67 
0.66 
0.71 
0.63 
0.62 
0.70 
0.57 
0.39 
0.63 
0.7S 
0.56 
0.65 
0.70 
0.73 
0.69 
0.55 
0.70 
0.65 
0.65 
0.64 
0.06 



11814 
1491180 

0.83 
0.65 



DELTA 
5.9 
6.7 
8.9 
11.6 
12.3 
12.0 
13.6 
13.0 
11.3 
14.0 
11.6 
10.2 
13.1 
13.0 
13.1 
9.6 
12.3 
12.4 
11.4 
10.4 

11*5 
11.4 

1.9 



mm 

10.9 
13.2 
7.5 
0.2 
0.2 



S.O. 
6.03 
4.70 
4.54 
0.61 
1.07 



Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
Appendix A-1 (continued) 
Item Analysis Statistics, Reading 



ITEH 1 

ITEM £ 

ITEM 3 

ITEH 4 

1TEH 5 

ITEM 6 

XTEH 7 

ITEM a 

XTEH 9 

XTEH 10 

XTEH 11 

XTEH 12 
XTEH IS 
XTEH 14 
XTEH IS 
XTEH 16 
ITEH 1? 
XTEH 16 
XTEH 19 
XTEH 20 
ITEH 21 
COlUrti HE AH 
COlUHN S.D. 

SAMPLE SIZE 
POPULATION ESTIMATE 

COEFFICIENT ALPHA 
SPLIT HALF RELIABILITY 



FORMULA SCORE 
NUHBCff RIGHT 
NUMBER WONG 
NUMBER OMITS 
NUMBER NOT REACHED 



0.95 
0.65 
0.82 
0*57 
0.55 
0.69 
9.41 
0.** 
0.61 
9.39 
0.59 

o.n 

0.50 

o.4a 

0.46 
0.76 
0.53 
0.54 
0.65 
0.70 

0.61 
0.14 



RBIS 

0.59 

0.62 

0.65 

9.66 

0.67 

0.65 

0.65 

0.60 

0.56 

0.65 

0.65 

0.76 

0.55 

0.65 

0.70 

0.74 

0.67 

0.53 

0.60 

9.64 

0.64 
0.07 



DELTA 
6.5 

a.e 

9.5 
12.3 
12.5 
12.0 
13.9 
13.1 
11.9 
14.1 
12.1 

10. a 

13.0 
13.2 
13.4 
10.1 
12.7 
12.6 
11.7 
10.9 

11. a 
11.7 

i.a 



23679 
3995290 

9.84 
9.85 

19.2 6.16 

12.6 4.61 

8.9 4.64 

9.2 0.65 

9.2 1.26 



0.95 

0.85 

0.89 

9.56 

9.54 

0.63 

0.43 

9.54 

9.66 

9.43 

9.64 

9.70 

9.54 

0.52 

9.51 

9.79 

9.57 

9.56 

9.65 

9.74 

9.63 
9.13 



RBIS 

0.70 

0.66 

0.79 

9.62 

9.69 

9.71 

9.69 

9.71 

0.51 

9.45 

9.64 

9.77 

0.62 

0.70 

9.72 

9.71 

9.64 

9.51 

9.69 

9.63 

9.68 

9.65 

9.98 



1590 
105759 

9.85 
8.07 



OELTA 
6.6 
8.9 
9.6 
12.4 
12.5 
11.7 
13.7 
12.6 
11.3 
13.7 
11.6 
19.9 
12.6 
12.8 
12.9 
9.8 
12.3 
12.4 
11.4 
10.5 

ills 

1.7 



fiEAN 
10.6 
13.1 
7.5 
0.2 
0.2 



6.28 
4.91 
4.74 
0.57 
1.16 



P+ 
0.93 
9.89 
9.75 
9*46 
9.41 
9.49 
9.29 
9.36 
9.55 
9.34 
9.54 
9.61 
9.43 
9.37 
9.36 
0.67 
9.39 
9.48 
9.52 
9.63 
9.50 
9.52 
9.16 



HISPANIC 



RBIS 

0.54 

0.58 

9.61 

0.64 

0.63 

0.61 

0.55 

0.66 

0.54 

0.45 

0.55 

0.68 

0.44 

0.53 

0.64 

0.66 

0.54 

0.47 

0.56 

0.57 

0.57 
0.07 



3993 
304711 

9.79 
9.61 



OELTA 
7.2 
9.7 
19.4 
13.4 
13.9 
13.1 
15.2 
14.4 
12.5 
14.6 
12.6 
11.9 
13.7 
14.3 
14.4 
11.3 
14.2 
13.2 
12.6 
11.7 

UJ 

12.7 
1.9 



7.7 
19.7 
9.7 
9.2 
0.4 



S.O. 
5.63 
4.44 
4.26 
9.76 
1.66 



P* 
9.93 
0.75 
0.73 
0.38 
0.45 
0.44 
0.26 
0.35 
0.51 
0.32 
0.46 
0.52 
0.38 
0.37 
9.36 
0.65 
9.49 
9.45 
9.45 
9.57 
9.48 
9*49 
9.16 



RBIS 

9.49 

9.55 

9.58 

9.62 

9.69 

9.55 

9.52 

9.62 

9.53 

9.49 

9... 

9.66 

9.38 

9.54 

9.69 

9.66 

9.49 

9.52 

9.58 

9.55 

9.55 
9.98 



DELTA 

7.1 
19.2 
19.5 
14.2 
13.6 
13.6 
15.6 
14.5 
12.9 
14.9 
15.4 
12.6 
14.2 
14.3 
14.5 
11.4 
14.9 
13.5 
13.5 
12.3 

ILi 
13.9 
1.9 



2071 
391769 

0,77 
0.89 

6.9 5.43 

19.9 4.26 

19.2 4.26 

9.3 9.63 

9.6 2.93 



P+ 
9.95 
0.86 
0.85 
0.61 
0.59 
0.65 
0.45 
9.54 
0.64 
0.42 
0.62 
0.76 
0.54 
0.51 
0.50 
0.69 
0.58 
0.56 
0.67 
0.74 

1A2 
0.65 
0.14 



RBIS 

0.63 

0.62 

0.64 

9.64 

0.66 

6.64 

0.62 

0.66 

0.57 

0.44 

0.66 

0.76 

0.58 

0.67 

0.70 

0.76 

9.69 

9.53 

9.68 

9.66 

aai 

9.64 
9.97 



DELTA 
6.2 
6.2 
6.8 
11. 7 
12.9 
21.4 
13.5 
12.6 
11.6 
13.8 
11.6 

10.2 

12*6 

12.9 

13.9 
9.6 

12.2 

12.4 

21.2 

19.4 

11.3 
1.6 



AHZRICAM US>m. 



15771 
2129481 

9.63 
9.84 

tSJSi UL. 

11.3 6.90 

13.5 4.65 

7.2 4.53 

9.2 9.50 

0.1 0.90 



P+ RBIS 

9.95 0.35 

9.72 8.53 

0.72 0.67 

0.45 0.59 

0-36 0.61 

0.45 0.66 

0.26 0.59 

9.33 9.76 

9.50 0.42 

0.29 0.51 

0*46 0.53 

0.56 0.73 



0.35 
0.34 
0.34 
0.60 
0.42 
0.36 
0.46 
0.59 

0.46 
0.16 



0.35 
0.52 
0.62 
0.70 
0.41 
0.54 
0.53 
0.56 

2*K 
0.56 
0.11 

306 

43293 

0.76 
0.78 



DELTA 

6.4 
10.7 
10.7 
13.5 
14.4 
13.5 
15.6 
14.6 
13.0 
15.2 
13.2 
12.4 
14.6 
14.6 
14.6 
12.9 
13.9 
14.5 
13.4 
12.0 
13.? 
13.2 
2.0 



em 

6.7 
9.9 
19.5 
9.4 
9.3 



S.D, 
5.52 
4.34 
4.24 
1.09 
1.29 



Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey. 
Appendix A-1 (continued) 
Item Analysis Statistics, Reading 



HISPANIC HALE 





p* 


ITCH 1 




ITEM 2 


fi TO 


ITEH 3 




ITEM 4 




item 5 


A 4kl 


ITEM 6 


fi U 




A 9A 


ITEM fi 


a m 


ITEM 9 


A CI 
«•» 


ITEM 10 


fi XX 


TTEN 11 


A C9 


ITEM 1? 


A CA 
V ■ 90 


ITEM 13 


0.45 


ITEM I* 


0.17 


ITEM 15 


0.34 


ITEM 16 


0.67 


ITEM 17 


0.37 


ITEM IS 


0.46 


ITEM 19 


0,51 


ITEM 20 


0.56 


ITEM 21 


0.50 


COLUMN WAN 


0.50 


COiltt! S.0. 


0.16 



SAMPLE SIZE 
POPULATION EST tf <ATE 

COEFFICIENT AIWA 
SPLIT HALF RELIABILITY 



FORMULA SCORE 
NUMBER RIGHT 
NUHBER WONG 
NUMBER OMITS 
NltSER NOT REACHED 



mis 

0.56 
0-57 

o.sa 

0.63 
0.59 
0.6* 

o.sa 

0.66 
0.55 
0.53 
0.52 
0.66 
9.66 
0.53 
0.66 
0.66 
0.56 
0.42 
0.54 
0.59 
0-51 
0.57 
0.07 

1437 
151316 



0.79 

o.eo 

flEAN S.P, 

7*3 5.61 

10,4 4.43 

9.9 4.25 

ft. 3 0,79 

0.4 1.75 



DELTA 
7.4 
9.0 
10.7 
13.0 
13.9 
13.0 
15.4 
14.4 
12.9 
14.7 
12.0 
12.4 
13.5 
14.3 
14.7 
11.2 
14.3 
13.4 
12.9 
12.2 

UJ 

12.9 
1.6 



HISPANIC FEHALF 

P* R8IS DELTA 

9.94 0.52 6. • 

0.00 0.56 9.6 

0.77 0.64 10.0 

9.49 0.64 13.1 

0.41 0,67 13.9 

0.46 0.56 13.2 

9.30 0.53 15.1 

0.37 0.67 14.4 

0.56 0.54 12.2 

0.35 0.37 14.5 

0.55 0.56 12.4 

0.66 0.69 11.3 

0.41 0.43 13.9 

0.36 0.53 16.3 

6.3)» 0.62 14.2 

0.66 0.67 11.3 

0.40 0.53 14.0 

0.50 9.51 15.0 

0.52 0.56 12.6 

0.66 0.55 11.2 

l*£3 UJ 

0.53 0.57 12.6 

0.16 0.06 1.9 

1545 
151394 



0.79 
0.61 



ma 

6.1 
11.0 
9.5 
0.2 
0.4 



S.P. 
5. 61 
6.42 
4.24 
0.73 
1.62 



BL-OC HALF 

miS DELTA 
0.91 0.46 7.6 

0.75 9.54 10.4 

0.71 0.57 10.7 

0.34 0.61 14.6 

0.43 0.56 13.7 

0.45 0.57 13.5 

0.26 0.50 15.5 

0.35 0.59 16.6 

0.45 0.46 13.5 

0.30 0.46 15.1 

0.41 0.60 13.9 

0.47 0.66 13.3 

0.41 0.63 13.9 

0.34 0.54 16.6 

0.33 0.71 14.7 

0.62 0.66 11.6 

0.36 0.44 16.4 

0.43 0.51 13.7 

0.43 0.52 13.7 

0.54 0.53 12.6 

ft*4S ii^ 

0.46 0.54 13.3 

0.16 0*06 1.6 

1366 
191961 

0.76 
0.79 



6.2 
9.4 
10.5 
0.3 
4.6 



S.D t 
5.31 
4.21 
4.24 
0.79 
2.37 



BLACK FFHALF 

P* DDIS DELTA 

0.95 0.50 6.4 

0.77 0.56 10.1 

0.75 0.56 10.3 

0.42 0.62 13.6 

0.46 0.66 13.4 

0.44 0.54 13.6 

0.26 0.56 15.6 

0.36 0.66 14.4 

0.57 0,57 12.3 

0.34 0.32 14.7 

0.51 0.52 12.9 

0.57 0.65 12.3 

0.34 0.35 14.4 

0.40 0.53 14.0 

0.36 0.67 14.2 

0.66 0.63 11.1 

0.44 0.54 13.6 

0.46 0.52 13.4 

0.47 0.64 13.3 

0.61 0.56 11.9 

4*31 0x55 Ux5 

0.51 0.56 12.6 

0.16 0.09 2.0 

1466 
197273 

0.76 
0.60 



WITE HALF 



7.5 
10.5 
9.6 
0.3 
0.4 



1L 

5.46 
4.29 
4.26 
0.64 
1.63 



P* RSIS DELTA 

0.94 0.65 6.7 

0.66 9.61 6.3 

0.63 0.62 9.2 

0*59 0.64 12.1 

0.57 0.62 12.3 

0.66 0.67 11.3 

0.44 0.64 13.6 

0.52 0.65 12.6 

0.56 0.55 12.2 

6.41 0.49 13.9 

0.57 0.66 12.3 

0*71 0.76 10.6 

0.55 0.56 12.5 

0.46 0.66 13.2 

0.46 0.70 13.4 

0.77 0.77 10.1 

0.54 0.66 12.6 

0.54 0.51 12.6 

0.64 0.67 11.6 

0.70 0.66 10.9 

£x&& Hill ILlS 

0.62 9.64 11.6 

0.14 0.07 1.6 



7631 
1061031 

0.64 
0.65 

HE AN S.P. 

10.5 6.12 

12.9 6.75 

7.6 6.63 

0*2 0.63 

0.2 1.02 



WHITE FEHALF 

P* RBIS DELTA 
0.97 0.56 5.6 
0.69 0*63 6.1 
0.66 0*66 6.4 

0.67 0.64 11. 2 

0.62 0.70 11.6 

0.65 0.62 11.5 

0.47 0*60 13.3 

0.55 0.66 12.5 

0.69 0.57 11.0 

0.42 0.39 13.6 

0.67 0.64 11.3 
0.61 0*75 9.4 

0.53 0.59 12.7 

0.54 0.67 12.6 

0.53 0,70 12.7 
0.63 0.74 9.3 

0.62 0.72 11.6 

0.59 0.55 12.1 

0.71 0.69 10,7 
0.76 0.66 9.9 

tiS U5 UJ 

0.67 0.66 11.0 

0.14 0.06 1.9 

7627 
1055764 

0*63 
0.64 



HE AM 
12.0 
14.1 
6.7 
0.1 
0.1 



S.P. 
5.76 
4.47 
4.37 
0.53 
0.75 



Source: U.S. Department of Education, National Center for Education Statistics, 
Longitudinal Study of 1988: Base Year Survey . 
Appendix A-2 
Item Analysis Statistics, Mathematics 
ITEM 
4.47 


11. f 


ITW 


SI 
9.14 


11*1 


4.44 


4*44 


11.1 


nw 


11 


9.4? 


9.19 


11.1 


4.44 


4*11 


11. t 


net 


14 


• .91 


•.17 


If. 9 


4.44 


4.44 


11.4 


XTfH 


IS 


4.44 


ft. 49 


It .4 


4.14 


4.41 


lt.1 
14 


4.44 


• •41 


14.4 


4.4t 


4.44 


11.4 


ITEIf 


17 


•.41 


••49 


11.4 


•.47 


4.74 


11.1 


XTfN 


14 


9.4t 


9.U 


11.4 


• .44 


4.11 


11.4 


ITfH 


If 


4.14 


4.79 


14.1 


9.14 


4.47 


I4.f 


11 219 


44 
Ul 


UJ 


4.14 


1*40 


UJ 


COU»« MI AM 


•.14 


•.Id 


If .1 


4.14 


4.44 


It. 4 
•.11 


•.11 
4.11 


1.1 
1400149 

4.94 
4.40 

ma aus* 

14.4 11. If 
11.4 4.71 
17.4 4.14 
4.4 f<*4 
4.1 1.47 



1491774 

4.44 

4.91 

mm 

14.1 11.44 
tl.7 4.44 
17.1 4.44 
4.4 f.14 
4.1 1.44 



4.71 
4.44 
4.44 
4.44 
9.14 
4.44 
4.41 
4.17 
9.4f 
4.41 
4.11 
4.44 
4.41 
4.47 
4.7t 
4.74 
4.74 
9. If 
4.41 
4.4t 
4.44 
4.47 
9.44 
4. 89 
4.49 
4.41 
4.49 
4.49 
4.41 
4.44 
4.44 
4.47 
4.41 
4.41 
4.44 
4.14 
4.44 
4.44 
4.34 

tutS 

4.44 
4.11 



■414 
4.44 
4.44 
4.14 
4.41 
4.44 
4.44 
4.74 
0.41 
4.41 
4.44 
9.44 
4.47 
4.44 
4.49 
4.49 
4.44 
9. At 
4.44 
4.44 
4.44 
4.41 
4.71 
4.44 
4.44 
4.44 
4.44 
4.77 
4.47 
4.44 
4.41 
4.47 
4.44 
4.14 
4.A4 
4.44 
4*41 
4.49 
4.14 
4.71 

tatt 
4.47 
4.11 



Of IT A 
14.4 
11.1 
11.1 
11.1 
11.4 
11.4 
11.4 
14.1 
11.9 
11*9 
14.7 
lft.A 
It. 4 
11.1 
14.7 
4*4 
14.4 
It. 4 
9.4 
4.1 
11.4 
U.l 
11.1 
U.l 
lt.9 
11.9 
lt.1 
lf.1 
11.7 
11.4 
It.t 
11.1 
11.1 
11.7 
If .4 
14.1 
11.4 
14.4 
14.1 

itti 

1.4 



11441 
1449411 

4.94 
4.44 

K&f l at a 
14.4 11.14 
t:.4 4.44 
17.4 4.11 
4.7 t.17 
4.1 1*14 



Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
Appendix A-2 (continued) 
Item Analysis Statistics, Mathematics 
ft.ft t»l* 

ft.t l.ft7 



ft. 77 

• .ftft 
ft. ftl 
ft.ft* 
ft.ftl 
ft. ftft 
ft. ftft 
ft.ftl 
ft.ft* 
ft.ftt 
ft.ftl 

• .ftft 

• .ftl 
ft. ftl 
ft. 7ft 
ft.ftt 
ft. 71 

• .ftl 

• .ftft 

• .Oft 

• .ftft 
ft.Tft 
••ft* 
ft.ftl 
ft.Tft 
ft.ft* 

• .47 

• .ftl 

• .ftft 

• .ft* 
ft. ftft 
ft. 71 
ft.ftl 
ft. ftft 
ft.ftl 
ft.ftl 
ft. 47 
ft.ftft 
ft.ftl 

l**f 
ft.ftl 
ft. 11 



AM* 
ft.Tft 
ft.ft* 
••17 
ft.Tl 
ft.Tft 
•.ft* 
ft.Tft 
ft.ft? 
ft ftl 
ft.ftt 
ft.ftl 
ft.Tt 
ft.Tft 
ft.Tt 
ft.ftft 
ft.ftft 
ft.ftt 
ft.Tft 
ft.ftft 
ft.»7 
• 37 
••79 
ft.l* 
ft.ftft 
ft.ftft 
ft.Tt 
ft.ftl 
ft.ftft 
ft.Tft 
ft.ftl 
ft.Tt 
ft.ftft 
ft.ftft 
ft.ftft 
ft.ftft 
ft.Tft 
•.ftft 
ft.ftft 
ft.Tft 

tai 

•.♦ft 

ft.U 

149ft 
IftftUl 

ft.ftt 
ft.ftl 



tSJH flJZ* 

19.7 U.tl 
24. ft 9.45 
14.7 ft.ftft 
ft.7 t.lT 
ft.t 1.5* 



0UTA 
I*.* 
U.ft 
U.ft 

lf.1 

U.ft 

11.1 
ll.ft 

U.7 
U.ft 
U.ft 
11.7 
It. ft 
U.7 
11.9 
Ift.f 
ft.ft 
lft .ft 
U.7 
ft.ft 
ft.* 
U.ft 
1ft. 1 
U.f 
U.7 
Ift.ft 
U.ft 

11.1 

11.9 
U.ft 

u.t 
ll.ft 
1ft .« 

If .9 
U.ft 
U.ft 
U.ft 

11.1 

11.4 
U.ft 

utft 

1.1 



ft.ftft 
t.lft 
ft.ftt 
ft. 1ft 
ft.ftl 
ft. 1ft 
ft.t* 
ft.t* 
ft. 1ft 
ft.t* 
ft.M 

ft.ftl 

ft.S* 

ft.ft* 

ft.7* 
ft.ftl 
0.41 
ft.Tt 
ft.Tft 
ft.»7 
ft.ftft 
ft. lft 
ft.** 
ft.ftl 
ft.ft* 
ft. ft7 

• .♦4 
ft 
ft 

• .47 

• .ft* 
ft.ftt 
ft.ftl 
ft.ftt 
ft. lft 
ft.tft 

• .19 
ft.tft 

ft.ftft 
ft. 11 



tftlft OtiTft 

••ft* U.ft 

ft.ftl U.t 

ft.tl U.t 

ft.ftt Ift.ft 

ft.*? 14. • 

• .11 Ift.ft 
ft.ftl IS. I 
ft.ftft Ift.ft 
9.9ft Ift.ft 
ft.ftft lft. 1 
ft.ftt lft.1 
ft.ftl U.ft 
ft.ftl 11.9 
9.49 U.f 
••♦ft ll.ft 
ft.ftft lft.1 
ft.ft* 11. • 
ft.ftft ll.ft 
ft.ftl lft. 7 
ft.ftl 19.1 
ft.ftft lt.1 
ft.ftft If.* 
ft.ftft U.ft 
ft.ftft ll.ft 
ft.ftft It. 7 

• .♦ft U.ft 
ft.ft? 11. 1 
•.ft* ll.ft 
ft.ftft ll.ft 

ll.ft 

ft.ftl 11.1 

ft.ft? U.ft 

ft.lt ll.ft 

ft.ftft 11.7 

ft. 19 11.1 

ft.ft? Ift.ft 

ft. 7ft Ift.f 

ft.t? 14.1 

ft.ftl Ift.ft 

♦.ii ii.* 

ft.U 1.4 



44 ft.ftft 



fftftft 
SftUftl 

ft.ftft 
ft.** 



ft.ftft 

ft.n 
ft.ft* 
ft. ift 
ft.u 
•.># 
ft.t? 
ft.ti 
•.» 
ft.t* 
ft.tt 

ft.i* 

ft. 17 
ft.ftft 
ft.Tft 
ft.ftl 
ft.l* 
ft. 71 
ft.7t 
ft.ftl 

• .4ft 

• .ftl 
ft.ftl 
ft. lft 

• .♦ft 
ft. lft 
ft. 17 
ft.l? 
ft.ftl 
ft. IT 
ft.ftl 
ft.l? 
ft. lft 
ft.ftft 
ft.tft 

• .11 

• .17 
O.fl 
1*12 
•.♦1 

ft.U 



U.l 9. Aft 

17.* 7.71 

tft.9 7.97 

1.9 t.ftt 

ft.ft t.lT 



fiftlft 
ft.ftl 
ft.ftl 
ft.tl 
ft.ft* 
ft.S* 
ft.H 
ft. ftft 
ft.ft? 
ft.M 
ft.ftft 
ft.l* 
ft.ft* 
ft.ftft 
ft.ftft 
9.4* 
t.ftt 
ft.l* 
ft.ftft 
ft.ftft 
ft.ftft 
ft.ftft 
ft.ft* 
ft.ftft 
ft.ftl 
ft.S? 
ft.** 
ft.ft? 
9.4ft 
ft.S* 
ft.ftft 
ft.ft* 
ft. lft 
ft.tft 
ft.ftt 
ft.lt 
ft.ftft 
ft.ftft 
ft.tl 
ft.ftft 

i*ft£ 
ft.ftft 

ft.U 

tftftft 

1904*2 

• 94 
•.•ft 



mm 

ft. 9 ft.ftft 

14.1 7.94 

tf.f 7.90 

It t.ft* 

ft.ft t.H 



ftCUA 
U.ft 

Ift.ft 
ll.ft 

!♦.♦ 

14. * 
ftft. ♦ 

15. ♦ 
lft. 9 
U.ft 
Ift.ft 
U.l 
U.ft 
U.l 

lft.1 
ll.ft 
Ift.ft 

11.9 
U.t 
Ift.ft 
lft. 7 

U.7 
U.S 
It.* 
11*7 
U.ft 
11.1 
U.t 
U.l 
14.4 
11.7 
U.l 
U.7 
U.l 
U.f 
11.4 
IB .ft 
U.f 
U.l 
lft. 9 
UJ> 

U.ft 
l.S 



Pi 

ft.Tft 
ft.ft* 
ft.ftft 
ft.** 
ft.ft* 
ft.ft* 
ft.ft* 
*.%• 
ft.ftft 
ft.ftft 
9.19 
».♦? 
ft.ft? 
ft.ftft 
ft.Tft 
ft.*! 
9.71 
t.SS 

• .ftl 
ft.ftl 
ft.Tft 
ft.Tft 
ft.ftft 
ft.ftl 
ft.Tl 
ft.ft* 
ft.** 
t.ftt 
•.ftft 
ft.ftft 
ft.ft* 
ft. 7* 
ft.ftft 
ft.M 
ft.ft* 

• .44 
ft.ftt 
ft.ftft 
ft.ftl 

U4 
ft. ft* 
ft.U 



mm 



•MS 0tLf* 

•.ft* Ift.ft 

• .ft* U.ft 
ft.t? 11.1 
ft.** U.T 
ft.ftft It.* 
ft.** U.t 
ft.** 11.* 
ft.ftft U.ft 
ft.M 11.* 
•.ftft ll.ft 
9.0 lft.1 
ft.ft* 11.1 
ft.?* U.l 
ft .49 U.ft 
ft.ftt Ift.ft 

• .♦ft ft.ft 
ft.ftft U.ft 
ft.** U.ft 

ft.ftft ft.ft 

ft.ftft 9.* 

ft.U !•.♦ 

•.7ft Ift.ft 

•.♦1 U.l 

ft.ftft u.ft 

•.ftl Ift.ft 

ft.ftl U.l 

ft.Tft U.l 

ft.ftft 11.* 

ft.ft* U.ft 

ft.ftft If.* 

• .44 U.ft 

• .ft* Ift.ft 
ft.t? U.l 
ft.ftft U.ft 
9.44 U.ft 
ft.ftt ll.ft 
ft.ftft U.ft 
ft.U ll.ft 
ft.ftft 11.7 
ft.ftft 11.1 
ft.S? U.l 
ft.U 1.4 



15749 
tUTftftft 

ft.ftft 
ft.ftft 

U.ft 11.6ft 
fl.t *.** 
U.l ft.ft 
ft.ft 1.0ft 
ft.l I. ftft 



Mama iinuH 

9* ftftlft PtlTA 

ft.ftft ft.ftft 

9.M ft.4» 

ft.ftt ft.ft? 

ft.U ft.ftft 

ft.l? ft.ft* 

ft. lft ft.tft 

ft.tft ft.ftt 

ft.tft ft.l? 

ft. lft *.l? 

ft.t? ft.ftl 

ft.tft 9.14 

ft.ll ft.ftft 

ft. lft ft.ftft 

ft.H ft.ftl 

ft.ftft 0.41 

ft. 71 ft.ftft 

ft* ft.U 

,19 ft.47 

74 *.♦* 

ftft ft. 9ft 

ftft ft.*4 

ftf ft.ftft 

M ft.ftl 

4ft *.ST 

,44 ft.ftt 

.♦ft ft.ftft 

,1? ft.ftft 

ftft ft.ft* 

lft ft.ft? 

,4f 9.4* 

.4ft ft.ftft 

ft.ftft ft.ftft 

ft.ftl ft.t* 

0.41 ft.ftl 

9.4? 0.47 

ft.t? 9.17 

ft.tft ft.ftl 

ft. 14 ft.tft 

ft. 74 0.44 

1*11 *,49 

•.41 ft.tft 

ft.U ft.U 



U.ft 

Ift.ft 

lft.* 

14.9 
U.ft 
U.ft 
Ift.ft 
IS. 7 
Ift.ft 
15.5 
15.4 
U.ft 
14.5 
14.7 
ll.ft 
Ift.ft 
If .t 
U.l 
Ift.ft 
ll.ft 
U.ft 
ll.ft 
U.S 
ll.ft 
U.ft 

u.t 

14.1 
ll.ft 
U.f 

ll.ft 

U.S 
U.ft 
U.7 
ll.ft 
U.ft 
15.5 
Ift.ft 
U.ft 
Ift.ft 

UJ 
U.ft 

1.4 



107 
41193 

9.0ft 
ft.ftft 

ft.ft ft.U 

14.4 7.91 

ft.l 7.1ft 

t.ft t 54 

ft.l l.ftl 



Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
Appendix A-2 (continued) 
Item Analysis Statistics, Mathematics 



ataaat m» 



nw i 

IlfJt t 
ITBI 1 

titn % 

tun § 

net t 

iron 7 

xm» * 

ITEff 9 
ITBI It 
ITt* U 

not It 
m* is 

XTfH U 

no* is 
itch i* 

lit* 17 

net is 
situ it 

ITEM ft 
ITCB tl 
ITflt Iff 

ntff ts 

ITCH tt 
ITCH tt 

m n u 
mm t7 

XTfH ft 

mil tt 

ITffl St 
XTfH SI 

nctt if 
nut is 

XTfH St 

inn ss 

ITPI St 
XTfH ST 
XTD* St 

XTf» it 

ITEM tt 

couttt mm 

CSUftt t.s. 

SAfftf SXZf 
WVUTH* tSTXMATt 

comscifNT tim* 
spur mj at uasxuty 



foavu stest 

Mtet* SIGHT 
MHMB 

>*m» OUTS 



f .tt 
t.M 

t.tl 

t.it 
t.ts 
t.tt 
t.M 
o.xt 
t.M 
t.t* 
o.n 
t.st 
[Item analysis statistics: the P+, RBIS, and DELTA columns and the summary statistics on these pages are not legible in the source scan.]
Appendix A-3
Item Analysis Statistics, Science



ITEM 1
ITEM 2
ITEM 3
ITEM 4
ITEM 5
ITEM 6
ITEM 7
ITEM 8
ITEM 9
ITEM 10
ITEM 11
ITEM 12
ITEM 13
ITEM 14
ITEM 15
ITEM 16
ITEM 17
ITEM 18
ITEM 19
ITEM 20
ITEM 21
ITEM 22
ITEM 23
ITEM 24
ITEM 25
COLUMN MEAN
COLUMN S.D.

SAMPLE SIZE
POPULATION ESTIMATE

COEFFICIENT ALPHA
SPLIT HALF RELIABILITY



FORMULA SCORE
NUMBER RIGHT
NUMBER WRONG
NUMBER OMITS
NUMBER NOT REACHED



P* 

0.70 
0.79 
0.09 
0.67 
0.70 
0.70 
O.OS 
0.97 
0.09 
0.33 
0.90 
0.00 
0.72 
O.SS 
0.39 
0.90 
0.92 
O.OS 
0.92 
0.41 
0.42 
0.37 
0.39 
0.32 

0.03 
0.15 



TOTAL

RBIS

0.S7 

0.51 

0.90 

0.95 

0.71 

0.07 

0.50 

0.90 

0.51 

0.53 

0.92 

0.50 

0.59 

0.05 

0.47 

0.42 

0.49 

0.54 

0.51 

0.39 

0.39 

0.30 

0.27 

0.90 

ft*32 
0.49 
0.10 



DELTA 
10.9 
9.0 
11.6 
XI. 3 
10.2 
10.2 
11.4 
12.3 
11.0 
12.7 
13.2 
11.3 
10.6 
12.7 
14.1 
13.4 
13.0 
13.5 
13.0 
13.9 
13.0 
14.3 
14.1 
14.0 

14*1 
12.6 
1.6 



23623 
2993973 

0.79 
0.77 

tt*M ML 

9.9 5.03 

13.3 4.52 

11.2 4.4S 

0.3 0.96 

0.1 0.90 



P* 
0.09 
0.00 
0.03 
0.63 
0.77 
0.70 
0.70 
0.61 
0.04 
0.54 
0.50 
0.70 
0.70 
0.50 
0.37 
0.46 
0.45 
0.49 
0.43 
0.41 
0.94 
0.39 
0.40 
0.33 

ItS 

0.16 



MALE
RBIS
0.60 
0.60 
0.49 
0.47 
0.70 
0.71 
0.50 
0.50 
0.52 
0.55 
0.46 
0.59 
0.59 
0.66 
0.47 
0.43 
0.53 
0.56 
0.52 
0.37 
0.42 
0.40 
0.39 
0.56 

ft*2S 
0.92 
0.11 



DELTA 
11.0 
9.6 
11.5 
11.6 
10.0 
10.2 
10.9 
11.9 
11.6 
12.6 
13.0 
10.9 
10.9 
12.2 
14.3 
13.0 
13.9 
13.1 
13.7 
13.9 
13.6 
14.6 
14.0 
14.7 

14*1 

12.3 
1.7 



11664 
1909300 

0 78 

0.79 

MEAN S.D.

10.2 6.10 

13.6 4.74 

11.0 4.67 

0.3 0.97 

0.1 1.05 



R» 
0.70 
0.77 
0.63 
0.70 
0.74 
0.70 
0.61 
0.54 
0.64 
0.93 
0.46 
0.62 
0.75 
0.49 
0.41 
0.46 
0.39 
0.41 
0.41 
0.41 
0.40 
0.39 
0.39 
0.32 

fttll 
0.53 
0.15 



FEMALE



RBIS
0.55 
0.41 
0.40 
0.45 
0.64 
0.62 
0.42 
0.42 
0.51 
0.31 
0.36 
0.54 
0.90 
0.64 
0.49 
0.41 
0.49 
0.92 
0.49 
0.33 
0.35 
0.37 
0.24 
0.95 

0.47 
0.10 



DELTA 

10.9 

10.1 

11.6 

10.9 

10.4 

10.2 

11.9 

12.6 

11.5 

12.7 

13.4 

11.7 

10.3 

13.1 

13.9 

13.4 

14.1 

13.9 

13.9 

13.9 

14.0 

14.1 

14.1 

14.9 

Iftxfi 

12.7 
1.6 



11763 
1405637 

0.72 
0.73 

vm 

9.6 5.62 

13.1 4.29 

11.5 4.26 

0.3 0.95 

0.1 0.91 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
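
The DELTA column in these tables moves in step with P+. A minimal sketch of that relationship, assuming DELTA is the ETS delta index (mean 13, standard deviation 4, computed from the normal deviate of the proportion correct); this appendix does not state the formula, and the function name `ets_delta` is hypothetical:

```python
from statistics import NormalDist


def ets_delta(p_plus: float) -> float:
    """Convert a proportion correct (P+) to a delta difficulty value.

    Assumption: delta = 13 - 4 * z, where z is the standard normal
    deviate whose lower-tail area equals P+. Harder items (low P+)
    therefore receive higher delta values.
    """
    if not 0.0 < p_plus < 1.0:
        raise ValueError("P+ must lie strictly between 0 and 1")
    return 13.0 - 4.0 * NormalDist().inv_cdf(p_plus)


# P+ values as read from the Total column for Items 1 and 10 (Science)
for p in (0.70, 0.53):
    print(f"P+ = {p:.2f} -> DELTA = {ets_delta(p):.1f}")
```

Under this assumption, P+ values of 0.70 and 0.53 give DELTA values of 10.9 and 12.7, which agree with the legible Total-column entries for those items.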
Appendix A-3 (continued)
Item Analysis Statistics, Science



TOTAL 







BBSS 


DELTA 


P» 


ITEM 1 


0 70 


0.57 


10.9 


0.66 




0 79 


0.52 


9. A 


0.61 


ITEM 3 


V . Of 


0 46 


11.6 


0 66 


ITEM 4 


0 67 

W • V w 


0.45 




0.66 




0. 76 


0.71 


10 £ 


0.76 


IT EN 6 


0.76 


0-67 


10.2 


0.76 


XTCH 7 


0.65 


0.50 


11.4 


0.70 


ITCH 6 


0-57 


0.46 


12.3 


0.53 


XTEtt 9 


0.64 


0.51 


11.6 


0.66 


1* 


0.53 


0.51 


12.7 


0.55 


II2H 11 


o.4a 


0.42 


23.2 


0.53 


I TEH 1? 


0.66 


0.56 


11,1 


0.79 


ITCH IS 


0.72 


0*54 


10.6 


0.77 


KCH 14 


0.53 


0,6 r 


22.7 


0.55 


ITCK IS 


0.39 


0.47 


14.1 


0.45 


JIEH 16 


0.46 


0.62 


13.4 


0.49 


ITCH IT 


0.42 


0.49 


13.0 


0.45 


ITEM 10 


0.45 


0.54 


13.5 


0.45 


2TEH 29 


0.42 


0.51 


13.6 


0.49 


ITCH 29 


0.42 


0.35 


13.9 


0.44 


IUH 21 


0.42 


0.39 


13.6 


0.47 


2 TEH 22 


0.37 


6.36 


16*3 


0.44 


tun 23 


0,39 


0.27 


14.2 


0.43 


ITEM 24 


0*32 


0.56 


14.6 


0.34 


ITCH 25 


SJtZ 




l&A 


0.24 


COlUTtf WAN 


0.53 


0.49 


12.6 


0.56 


coium s.o. 


q.is 


0,10 


1.6 


0.15 



SAMPLE SIZE              23623
POPULATION ESTIMATE      2993973

COEFFICIENT ALPHA        0.75
SPLIT HALF RELIABILITY   0.77

                         MEAN    S.D.
FORMULA SCORE            9.9     5.61
NUMBER RIGHT             13.3    4.52
NUMBER WRONG             11.2    4.46
NUMBER OMITS             0.3     0.96
NUMBER NOT REACHED       0.1     0.96



ASIAN
RBIS 
0*59 
0.55 
0.52 
0.42 
0.70 
0.69 
0.46 
0,52 
0.54 
0,56 
0.39 
0,61 
0,50 
0.67 
0.47 
0.47 
0.54 
0.55 
0.53 
0.45 
0.41 
0.39 
0.35 
0.56 
0 f 35 
0.51 
0.10 

2492 
105061 

0.77 
fr.76 



DELTA 
11.1 
9.5 
11.2 
11.5 
10.0 
10.1 
10.9 
12.7 
11.1 
12.4 
12.7 
10.9 
10.1 
12.5 
13.5 
13.2 
15.5 
13.5 
13.1 
13.6 
13.3 
13.6 
13.7 
14.6 

12.3 
1.6 



turn 

10.6 
24.0 
10.5 
0.3 
0.2 



6.05 
4.71 
4.67 
0.95 
1.25 



P* 
0.63 
0.72 
0.57 
0.62 
0.67 
0.65 
0.61 
0.46 
0.56 
0.41 
0.42 
0.57 
0.66 
0.36 
0.37 
0.43 
0.34 
0.34 
0.33 
0.36 
0.36 
0.33 
0.35 
0.24 

0.46 
0.15 



RBIS 
0.46 
0.49 
0.46 
0.36 
0.64 
0.60 
0.46 
0.46 
0.46 
0.46 
0.44 
0.54 
0.52 
0.53 
0.65 
0.31 
0.39 
0.41 
0.39 
0.26 
0.29 
0.31 
0.20 
0.53 
Lll 

0.43 
0.10 

2969 
302672 

0.67 
0.69 



DELTA 
21.6 
10,6 
12.3 
11.6 
11.2 
11.4 
11.9 
13,2 
12.4 
13.9 
13.6 
12.3 
11.3 
14.4 
14.3 
13,7 
24.7 
24.7 
14.7 
14.4 
14.4 
14.6 
14.5 
15.6 

liLd 

13.6 
2.6 



ma 

7.5 
11.5 
22.9 
0.4 
0.2 



5.19 
4.05 
4.07 
1.03 
1.29 



0.51 
0.69 
0.53 
0.57 
0*58 
0.65 
0.55 
0.46 
0.53 
0.43 
0.60 
0.50 
0.61 
0.25 
0.26 
0.39 
0.32 
0.30 
0.31 
0.36 
0.36 
0.29 
0.34 
0.20 

&J* 

0.42 
0.14 



JIMS- 
RBIS 
0.45 
0,44 
0.40 
0.40 
0.62 
0,56 
0.46 
9.39 
0.46 
0.39 
9.36 
0.*7 
0.50 
0.46 
0.43 
0.32 
0.30 
0.34 
0-45 
0.30 
0.27 
0.34 
0.25 
0.51 
$.32 
0.41 
0.09 

2649 
365339 

0.62 
0.65 



DELTA 

12.9 

11.0 

12.7 

12.3 

12.2 

11.5 

12.5 

13.2 

12.7 

13.7 

14.0 

12.6 

11.9 

15.7 

IS. 4 

14.1 

14.9 

15.1 

25.0 

24.4 

24.4 

25.2 

24.7 

16.4 

aix9 

13.6 
1,5 



ma 

6.3 
10.5 
13.7 
0.4 
0.3 



4.61 
3.76 
3.93 
1.10 
1.70 



0.75 
0.63 
0.67 
0.69 
0.60 
0.60 
0,66 
0.62 
0.66 
0.57 
0,50 
f,70 
0.75 
0.61 
0.41 
0.46 
0.45 
0.50 
0.46 
0.43 
0.44 
0.36 
0.41 
6.36 

0.57 
0.26 



MIL 
RBIS 
0.57 
0.49 
0.47 
0.45 
0.71 
0.67 
0.49 
0.45 
0.50 
0.5$ 
0.41 
0.55 
0.53 
0.65 
0.46 
0.44 
0.51 
0.55 
0.59 
0.36 
0.42 
0.36 
0.27 
0.54 

0.49 
0.10 



1576* 
2127441 

0.74 
0.76 

10.9 5.66 

14.2 4.39 

10.5 4.36 

0.3 0.69 

0.1 0.66 



mnw imw , 

DELTA MIS DELTA 

10.4 0.55 0.50 12.5 
9.4 0.65 0.57 11.4 

11.2 0.52 0.45 St. 6 

11.0 0.55 0.59 12.5 

9.6 0.62 0.69 11.6 

9.7 0.59 0.66 12.1 

11.1 0.54 0.55 12.6 
11.9 0.46 0.51 13.4 

11.2 0.49 0.49 15.1 

22.2 0.39 0.52 14.1 
13.0 0.35 0.39 14.6 
10.9 0.56 0.45 12.2 

10.3 0.60 0.62 12.0 
11.9 0.33 0.51 14.6 
13.9 0.27 0,49 15.4 
13.2 0.34 0.37 14.7 

13.5 0.32 0.35 14.6 
13.0 0.34 0,46 14.6 

13.5 0.26 0.47 15.4 
15.7 0.34 0.17 14,7 

13.6 0.36 0.21 14.3 

14.2 0.27 0.46 15.4 
15.9 0.41 0.29 13.9 

14.4 0.16 0.35 16.7 

12.3 0,42 0.46 13.9 
1.7 0.14 0.13 1.5 

307 
43163 



0.71 
0*72 



HjAN 

6.2 
10.4 
13.6 
0.5 
0,3 



5.43 
4.26 
4,25 
1.66 
1.53 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
Appendix A-3 (continued)
Item Analysis Statistics, Science



HISPANIC HALE 



1 
2 
3 
6 
5 
6 

r 
a 

9 



ON 



ITEH 
ITEM 
ITEH 
ITEH 
ITEM 
ITEM 
ITEM 
ITEM 
ITEM 
ITEM 10 
ITEM II 
ITEM 12 
ITEH IS 
ITEM 1* 
ITEM 15 
ITEM It 
ITEM 17 
ITEM 19 
ITEM 19 
ITEM 20 
ITEM 21 
ITEM 22 
ITEM 23 
ITEM 24 
ITEM 25 
COllttl MEAN 
count S.D. 



0.62 
0.73 
0.59 
0*60 
0.69 
0.6* 
0.66 
0.52 
0.55 
0.43 
0.43 
0.61 
0.62 
0.40 
0.37 
0.41 
0.35 
0.35 
0.34 
0.37 
0.38 
0.31 
0.36 
0.26 

U2 

0.47 
0.15 



PBIS 
0.52 
0.54 
0.46 
0.37 
0.71 
0.63 
0.53 
0.4* 
0.50 
0.50 
0.4* 
0.56 
0.56 
0.58 
0.44 
0.35 
0.41 
0.44 
0.38 
0.32 
0.30 
0.2* 
0.24 
0.56 

!LM 
0.46 

oai 



DEITA 

11.7 

10.5 

12.1 

12.0 

11.0 

11.1 

11.3 

12.8 

12.5 

13.7 

13.7 

11.9 



.7 
.0 
.4 
.9 
.6 
.5 



11. 
14. 
14. 
13. 
14. 
14. 
14. 
14.4 
14.2 
13.0 
14.4 
15.6 

13.3 
1.6 



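HISPANIC FEMALE

                P+      RBIS    DELTA
ITEM 1          0.64    0.45    11.5
ITEM 2          0.71    0.44    10.8
ITEM 3          0.56    0.47    12.4
ITEM 4          0.65    0.41    11.5
ITEM 5          0.65    0.56    11.4
ITEM 6          0.62    0.57    11.8
ITEM 7          0.56    0.45    12.4
ITEM 8          0.44    0.42    13.6
ITEM 9          0.57    0.46    12.3
ITEM 10         0.39    0.41    14.1
ITEM 11         0.40    0.39    14.0
ITEM 12         0.52    0.52    12.8
ITEM 13         0.69    0.50    11.0
ITEM 14         0.32    0.45    14.8
ITEM 15         0.37    0.46    14.3
ITEM 16         0.45    0.29    13.6
ITEM 17         0.33    0.36    14.8
ITEM 18         0.32    0.37    14.8
ITEM 19         0.32    0.40    14.9
ITEM 20         0.36    0.23    14.4
ITEM 21         0.34    0.26    14.6
ITEM 22         0.35    0.56    14.5
ITEM 23         0.34    0.15    14.6
ITEM 24         0.22    0.49    16.1
ITEM 25         [illegible]
COLUMN MEAN     0.45    0.41    13.5
COLUMN S.D.     0.15    0.10    1.6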



BLACK MALE

                P+      RBIS    DELTA
ITEM 1          0.50    0.45    13.0
ITEM 2          0.69    0.49    11.0
ITEM 3          0.54    0.41    12.6
ITEM 4          0.54    0.42    12.6
ITEM 5          0.60    0.67    12.0
ITEM 6          0.64    0.61    11.6
ITEM 7          0.57    0.54    12.3
ITEM 8          0.51    0.44    12.9
ITEM 9          0.54    0.49    12.6
ITEM 10         0.43    0.37    13.7
ITEM 11         0.39    0.43    14.1
ITEM 12         0.56    0.48    12.4
ITEM 13         0.56    0.50    12.4
ITEM 14         0.29    0.45    15.2
ITEM 15         0.27    0.43    15.4
ITEM 16         0.39    0.31    14.2
ITEM 17         0.32    0.34    14.9
ITEM 18         0.31    0.27    15.0
ITEM 19         0.31    0.45    15.0
ITEM 20         0.36    0.28    14.5
ITEM 21         0.36    0.32    14.5
ITEM 22         0.27    0.34    15.5
ITEM 23         0.34    0.25    14.7
ITEM 24         0.19    0.56    16.5
ITEM 25         [illegible]
COLUMN MEAN     0.42    0.42    13.8
COLUMN S.D.     0.14    0.10    1.6



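BLACK FEMALE

                P+      RBIS    DELTA
ITEM 1          0.52    0.47    12.8
ITEM 2          0.70    0.38    10.9
ITEM 3          0.52    0.38    12.8
ITEM 4          0.59    0.39    12.1
ITEM 5          0.56    0.58    12.4
ITEM 6          0.65    0.56    11.4
ITEM 7          0.53    0.37    12.7
ITEM 8          0.46    0.33    13.4
ITEM 9          0.52    0.43    12.8
ITEM 10         0.42    0.41    13.8
ITEM 11         0.42    0.29    13.8
ITEM 12         0.48    0.46    13.2
ITEM 13         0.66    0.49    11.3
ITEM 14         0.21    0.51    16.2
ITEM 15         0.28    0.44    15.3
ITEM 16         0.40    0.33    14.0
ITEM 17         0.31    0.25    15.0
ITEM 18         0.28    0.42    15.3
ITEM 19         0.30    0.45    15.0
ITEM 20         0.37    0.31    14.4
ITEM 21         0.37    0.22    14.4
ITEM 22         0.32    0.34    14.9
ITEM 23         0.34    0.23    14.7
ITEM 24         0.20    0.45    16.3
ITEM 25         [illegible]
COLUMN MEAN     0.42    0.39    13.8
COLUMN S.D.     0.14    0.09    1.6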



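WHITE MALE

                P+      RBIS    DELTA
ITEM 1          0.74    0.60    10.4
ITEM 2          0.84    0.60    9.1
ITEM 3          0.67    0.49    11.2
ITEM 4          0.66    0.48    11.3
ITEM 5          0.82    0.79    9.4
ITEM 6          0.80    0.73    9.7
ITEM 7          0.73    0.58    10.6
ITEM 8          0.64    0.49    11.5
ITEM 9          0.67    0.50    11.2
ITEM 10         0.58    0.56    12.2
ITEM 11         0.53    0.45    12.7
ITEM 12         0.74    0.58    10.4
ITEM 13         0.73    0.58    10.5
ITEM 14         0.66    0.64    11.3
ITEM 15         0.39    0.47    14.1
ITEM 16         0.49    0.45    13.1
ITEM 17         0.48    0.54    13.2
ITEM 18         0.54    0.58    12.6
ITEM 19         0.46    0.52    13.4
ITEM 20         0.43    0.39    13.7
ITEM 21         0.46    0.44    13.4
ITEM 22         0.37    0.41    14.4
ITEM 23         0.41    0.30    13.9
ITEM 24         0.37    0.53    14.3
ITEM 25         [illegible]
COLUMN MEAN     0.58    0.52    12.1
COLUMN S.D.     0.16    0.11    1.8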



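WHITE FEMALE

                P+      RBIS    DELTA
ITEM 1          0.75    0.54    10.3
ITEM 2          0.79    0.39    9.8
ITEM 3          0.67    0.46    11.2
ITEM 4          0.73    0.44    10.6
ITEM 5          0.79    0.63    9.8
ITEM 6          0.80    0.61    9.6
ITEM 7          0.63    0.41    11.7
ITEM 8          0.57    0.41    12.3
ITEM 9          0.68    0.50    11.1
ITEM 10         0.57    0.50    12.3
ITEM 11         0.47    0.36    13.3
ITEM 12         0.67    0.52    11.3
ITEM 13         0.77    0.49    10.0
ITEM 14         0.56    0.63    12.4
ITEM 15         0.44    0.47    13.6
ITEM 16         0.48    0.43    13.2
ITEM 17         0.42    0.48    13.8
ITEM 18         0.45    0.52    13.5
ITEM 19         0.45    0.68    13.6
ITEM 20         0.42    0.34    13.8
ITEM 21         0.42    0.38    13.8
ITEM 22         0.41    0.37    14.0
ITEM 23         0.40    0.24    14.0
ITEM 24         0.36    0.54    14.5
ITEM 25         0.25    [illegible]  15.8
COLUMN MEAN     0.56    0.46    12.4
COLUMN S.D.     0.16    0.09    1.7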



                         HISPANIC   HISPANIC   BLACK      BLACK      WHITE        WHITE
                         MALE       FEMALE     MALE       FEMALE     MALE         FEMALE
SAMPLE SIZE              1431       1537       1375       1455       7827         7820
POPULATION ESTIMATE      150344     150327     168257     194547     [illegible]  1054444

COEFFICIENT ALPHA        0.71       0.62       0.63       0.58       0.77         0.70
SPLIT HALF RELIABILITY   0.73       0.64       0.68       0.62       0.79         0.72

MEAN
FORMULA SCORE            7.8        7.2        6.3        6.3        11.3         10.6
NUMBER RIGHT             11.8       11.3       10.5       10.5       14.4         13.9
NUMBER WRONG             12.7       13.1       13.7       13.8       10.2         10.8
NUMBER OMITS             0.4        0.4        0.5        0.4        0.3          0.3
NUMBER NOT REACHED       0.2        0.2        0.4        0.3        0.1          0.1

S.D.
FORMULA SCORE            5.48       4.86       4.99       4.61       5.94         5.38
NUMBER RIGHT             4.26       3.80       3.91       3.60       4.59         4.16
NUMBER WRONG             4.24       3.87       4.08       3.76       4.55         4.14
NUMBER OMITS             0.97       1.08       1.12       1.09       0.92         0.86
NUMBER NOT REACHED       1.21       1.33       1.69       1.49       0.68         0.66

Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
Appendix A-4

Item Analysis Statistics, History/Citizenship/Geography

TOTAL MALE FEMALE







mis 


HILT A 


A ^ 

F9 


RBIS 


DELTA 


F* 


OBIS 


0ELTA 


*itn a 


A Aft 

9.09 


A a a 


¥•7 


A M 

9* 79 


0.50 


9.7 


0.00 


0.50 


9.6 


TTFM 9 

nut c 


ft 17 

9 . fr 


ft Ai. 

0.66 


* A M 

19.9 


A W 

9.77 


ft AM 

9.69 


v A V 

10.1 


0.70 


9*62 


9.9 


ITCH 1 


A OA 

V • 99 


ft ii 
9* 76 


7.9 


ft AA 


A <tA 

9.79 


0.2 


0.91 


0.73 


7.6 




A AA 

9.99 




it i 


A 9ft 

9. 79 


A AY 

9.67 


V A A 

10.9 


0.67 


0.59 


11.3 


TTFM K 


A AA 

9.06 


A AA 
9 •DO 


9*7 


ft Of 

9.97 


A AA 

9.64 


A A 

0.5 


0.05 


0.6O 


0.0 




A AA 


A AA 
9.99 


A 1 


ft Al 

9 .9 J 


ft AA 

9.99 


A A 

9.2 


0.04 


0.53 


0.9 


Aim * 


A Ol 

V. TA 


A AC 


7.7 


A AA 

9. ¥9 


A AA 

0.96 


7.0 


0.91 


0.05 


7.6 


TTFN A 


A AA 


A 99 

9* 79 


ft * 

9* 9 


ft A A 

9.99 


ft Tf 

9.73 


0.2 


0.00 


0.72 


0.3 


TTTM O 


A At 

9.91 


A AC 


7.6 


A ni 

9*91 


A AA 

9.99 


7.6 


0.91 


0.06 


7.5 


a ten iv 


A Tft 
if • f 9 


A A 9 


11 A 

11.9 


A 9ft 

9. 79 


A Kl 

0.51 


10.9 


0.70 


0.44 


10.9 


Attn A A 


A AA 


A it 

9.99 


1 A 1 

A< . A 


ft AS 

9.6* 


m AA 

0.66 


11.7 


0.55 


0.59 


12.5 


A I En AC 


A KK 


A C9 
9.9* 


19 C 


ft C9 
9.9A 


A 26 

9.54 


V A A 

12.0 


0.50 


0.51 


12.2 


if en 11 


A KA 
9 .99 


A AA 

0.99 


1 ft A 

12.* 


ft it 

9*61 


0.63 


11.9 


0*55 


0.53 


12.5 


litn a* 


A A9 


A At 

9.4 A 


13.0 




0.43 


13.6 


0.40 


0.40 


14.0 


A 1 Cffl A9 


A A9 

9.97 


ft AA 
9.59 


13.3 


J* A A 

9.40 


0.62 


13.2 


0.46 


0.55 


13.4 


A 1 All AP 


ft 4K 
9**9 


ft AC 

9.49 


13.5 


9.46 


9.50 


13.4 


0.44 


0.40 


15.7 


A Ten 17 


A 11 

9.93 


ft X.A 

9.64 


A V 

9.1 


A A4 

9*8% 


A « m 

0.60 


0.0 


0.03 


0.60 


9.2 


TTFM t A 
AlCff AO 


A ^A 

9.79 


A A A 

0.59 


A A 

9.9 


0.76 


0-61 


0.9 


0.70 


0.54 


9.9 


f < 1 ft 

lit, ■ It 


ft T8A 


ft 9A 

9.73 


* A A 

19.1 


0. 74 


0.7? 


10.4 


0.79 


0.69 


9.0 


TTCM 9ft 

A 1 tn 29 


A AA 

9 .fro 


ft Aft 

9.99 


« • A 

11.9 


0.66 


0.62 


11.3 


0.65 


0.50 


11.5 


1TFM 91 

Aicn A A 


A AA 
9.99 


ft AO 


It A 

11.9 


0.73 


A A A 

9.66 


V A A 

10.5 


0.59 


0.54 


12.1 


irfu ft ft 

Aitn zz 


0.40 


A r ^ 
9.99 


13*2 


0.40 


0.50 


13.2 


0.40 


0.53 


13.2 


ITCH 23 


fi 40 


A AA 

V.9v 


1 « 9 


ft AA 


ft CO 

9.5Z 


19 9 
A3. A 


ft AO 

9.47 


A Aft 

9.45 


13.3 


I TEH 24 


0.54 


0.54 


12.6 


0.54 


0.50 


12.0 


0.54 


0.49 


12.6 


ITEN 25 


0.47 


0.46 


13.3 


0.46 


0.45 


15.4 


0.40 


0.40 


13.2 


itch 26 


0.49 


0.52 


13.1 


0.51 


0.54 


12.0 


0.46 


0.49 


13.4 


ITCH 27 


0.51 


0.40 


12.9 


0.52 


0.63 


12.0 


0.51 


0.50 


12.9 


I TtH 2d 


0.45 


0.44 


13.7 


0.47 


0.49 


13.3 


0.39 


0.43 


14.1 


ITCH 29 


0.35 


0.55 


14.5 


0.33 


0.32 


14.5 


0.35 


0.30 


14.5 


ITCH 30 




0.20 


1SJ 


£a16 


0.86 


14a1 


&a19 


suts 


14. Q 


couto* HI AH 


0.63 


0.50 


11.4 


0.64 


0.60 


11.3 


0.63 


0.50 


11.5 


COUtO* S.O. 


0.10 


0.13 


2.2 


0.16 


0.13 


2.1 


0.19 


0.13 


2.3 



SAMPLE SIZE 
POPULATION ESTIMATE 

COEFFICIENT ALPHA
SPLIT HALF RELIABILITY



FORMULA SCORE
NUMBER RIGHT
NUMBER WRONG
NUMBER OMITS
NUMBER NOT REACHED 



23536 
2904503 

0.03 
0.04 

15.1 7.64 

10.9 5.53 

10.0 5.41 

0.2 0.92 

0.1 0.09 



11600 
1404333 

0.05 
0.06 



HE AN 
15.4 
19.2 
10.5 
0.2 
0.1 



S.O. 
7.91 
5.75 
5.60 
0.07 
0.09 



11753 
1401344 

0.02 
0.02 

HE AN S-ffi 

16.0 7.33 

10.7 5.29 

11.0 5.20 

0.2 0.95 

0.1 0.91 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
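
Each table reports a COEFFICIENT ALPHA for the test. A minimal sketch of how such a value can be computed from right/wrong item scores, assuming the usual unweighted formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores); the report's own figures may have used sample weights, and both the function name and the tiny data set below are hypothetical:

```python
from statistics import pvariance


def coefficient_alpha(responses):
    """Unweighted coefficient alpha for a list of examinee response rows.

    Each row holds one examinee's item scores (1 = right, 0 = wrong).
    Assumption: population variances are used for items and totals.
    """
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across examinees
    item_variances = [
        pvariance([row[i] for row in responses]) for i in range(n_items)
    ]
    # Variance of the number-right total score
    total_variance = pvariance([sum(row) for row in responses])
    return n_items / (n_items - 1) * (1.0 - sum(item_variances) / total_variance)


# Hypothetical four examinees on a three-item test
toy = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
print(round(coefficient_alpha(toy), 2))
```

For the toy data the item variances sum to 0.625 and the total-score variance is 1.25, so alpha is 1.5 * (1 - 0.5) = 0.75.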
Appendix A-4 (continued)



Item Analysis Statistics, History/Citizenship/Geography 



torn 







P* 


WIS 


DELTA 


P* 


ITEM 


I 


9*80 


0.56 


9.7 


0.64 


ITEM 


2 


0.77 


0.66 


10.0 


0.75 


ITEM 


3 


9*99 


0.76 


7.9 


0.90 


ITEM 


4 


0.60 


0.63 


11.1 


0.63 


ITEM 


5 


0.66 


0.66 


6.7 


0.66 


ITEM 


6 


0.64 


0*54 


9.1 


0.65 


ITEM 


7 


0.91 


0.65 


7.7 


0.89 


ITEM 


0 


0*66 


0.73 


8.3 


0.87 


ITEM 


9 


0.91 


0.65 


7.6 


0.89 


ITEM 


10 


0.70 


0.47 


11.0 


0.70 


ITEM 


11 


0.59 


0.63 


12.1 


0.62 


ITEM 


22 


0.55 


0.52 


12.5 


0.64 


ITEM 


13 


0.56 


0.56 


12.2 


0.59 


ITEM 


1* 


0.42 


0.41 


13.6 


0.56 


ITEM 


IS 


0.47 


0.59 


13.3 


0.55 


ITEM 


16 


0.45 


0.45 


13.5 


0.54 


ITEM 


17 


0.63 


0.64 


9.1 


0.81 


ITEM 


Id 


0.76 


0.59 


9.9 


0.80 


ITEM 


19 


0.76 


0.71 


10.1 


0.62 


ITEM 


20 


0.66 


0.60 


11.4 


0.65 


ITEM 


21 


0.66 


0.59 


11.4 


0.76 


ITEM 


^^ 


0.46 


0.56 


13.2 


0.57 


ITEM 


23 


0.46 


0.46 


13.2 


0.52 


ITEM 


24 


0.54 


0.54 


12.6 


0.56 


ITEM 


25 


0.47 


0.46 


13.3 


0.52 


ITEM 


26 


0.49 


0.52 


13.1 


0.50 


ITEM 


27 


0.51 


0.60 


12.9 


0.56 


ITEM 


26 


4.43 


0.46 


13.7 


0.45 


ITEM 


29 


P . 35 


0.35 


14.5 


0.40 


ITEM 


30 


9,25 


&lIS 


15.8 


0.29 


COLUTtl MEAN 


0.63 


0,56 


11.4 


0.67 


com*! s.d. 


0.16 


0.13 


2.2 


0.16 



ASIAN



SAMPLE SIZE 
POPULATION ESTIMATE 

COEFFICIENT ALPHA 
SPLIT HALF RELIABILITY 



FORMULA SCORE
NUMBER RIGHT
NUMBER WRONG 
NUMBER OMITS 
NUMBER NOT REACHED 



23536 
2964563 

0.63 
0.64 



15.1 
18.9 
10.6 
0.2 
0.1 



7.64 
5.53 
5.41 
0.92 
0.69 



RfilS 
0.57 
0.72 
0.80 
0.62 
0.72 
0.64 
0.95 
0.60 
0.93 
0.56 
0.66 
0.52 
0.63 
0.52 
0.59 
0.46 
0.69 
0.61 
0.76 
0.65 
9.65 
0.54 
0.50 
0.52 
0.45 
0.*6 
0.6i 
0.52 
0.42 
0.34 
0.62 
0.14 

1465 
104503 

0.66 
0.67 



DELTA 
9.0 

10.3 
7.6 

11.6 
6.6 
6.9 
8.0 
8.4 
6.1 

10.9 

11.7 

11.5 

12.1 

12.3 

12.7 

12.6 
9.5 
9.6 
9.3 

11.4 

10.1 

12.3 

12.8 

12.4 

12.8 

13.0 

12.2 

13.5 

14.0 

15.2 

11.1 
1.9 



HISPANIC 



MBSL 



ME*N 
16.3 
19.9 
9.8 
0.2 
0.1 



1LJL. 

8.10 
5.83 
5.67 
1.07 
0.93 



p* 


RBXS 


DELTA 


Pt 


0.74 


0.54 


19.4 


0.66 


0.64 


9.60 


11.6 


0.73 


0.64 


0.66 


9.0 


0.62 


0.50 


0.54 


13.0 


0.54 


0.8O 


0.62 


9.7 


0.79 


0.75 


0.54 


19.3 


0.76 


9.82 


0.62 


9.3 


0.63 


0.79 


0.70 


9.7 


0.63 


0.81 


0.81 


9.5 


0.64 


0.67 


0.41 


11.2 


0.62 


0.46 


6.53 


13.2 


0.45 


0.47 


0.46 


13.3 


0.46 


0.52 


0.51 


12.6 


0.50 


0.49 


0.43 


13.1 


0.35 


0.40 


0.50 


14.1 


0.33 


0.36 


0.42 


14.2 


0.36 


0.73 


0.65 


19.5 


0.63 


0.70 


0.53 


10.9 


0.68 


0.70 


0.63 


19.9 


0.63 


0.54 


0.53 


12.6 


0.52 


0.57 


0.51 


12.3 


0.48 


0.44 


0.44 


13.6 


0.34 




ft &c 


1 J.O 




0.47 


0.47 


13.3 


0.45 


0.40 


0.39 


14.0 


0.40 


0.37 


0.41 


14.3 


0.32 


0.41 


0.53 


13.9 


0.38 


0.35 


0.33 


14.5 


0.31 


0.31 


0.29 


15.0 


0.32 


0.23 


auz 


15^9 


ffttC 


0.56 


0.51 


12.3 


0.54 


0.17 


0.14 


1.9 


0.19 




2981 







301603 

0.61 
0.62 



ma 

11.9 
16.7 
12.6 
0.3 
0.2 



7.64 
5.46 
5.33 
1.27 
1.40 



RB1S 
0.47 
0.56 
0.66 
0.57 
0.56 
0.53 
0.78 
0.67 
0.77 
0.38 
0.44 
0.46 
0.47 
0.34 
0.49 
0.31 
0.61 
0.50 
0.65 
0.46 
0.47 
0.42 
0.39 
0.49 
0.41 
0.31 
0.47 
0.32 
0.26 
0.05 
0.48 
0.15 

2845 
384751 

0.76 
0.77 



DELTA 

11.3 

10.5 
9.4 

12.6 
9.6 
9.9 
9.2 
9.2 
9.0 

11.6 

13.5 

13.5 

13,0 

14.6 

14.7 

14.5 
9.2 

11.1 

11.6 

12.8 

13.2 

14.7 

14.0 

13.5 

14.1 

14.8 

14.2 

14.9 

14.9 

16,1 

12.5 
2.1 



nm 

11.2 
16.1 
13.4 
0.3 
0.2 



6.90 
4.93 
4.86 
1.02 
1.36 



P* 
0.63 
0.61 
9.92 
0.74 
0.69 
0.86 
0.94 
0.91 
0.94 
0.72 
0.63 
0.57 
0.66 
0.41 
0.50 
0.47 
0.66 
0.61 
0.80 
0.70 
0.70 
0.51 
0.59 
0.57 
0.49 
0.54 
0.55 
0.46 
0.36 

4^5 
0.66 
0.19 



-"UTS 



RBIS 
0.56 
0.67 
0.79 
0.62 
0.67 
0.50 
0.86 
0.72 
0.67 
0.49 
0.65 
0.53 
0.60 
0.43 
0.59 
0.47 
0.64 
0.59 
9.76 
0.60 
0.59 
0.58 
0.50 
0.54 
0.47 
0.53 
0.61 
0.46 
0.37 

IxU 
0.59 
0.13 



DELTA 
9.2 
9.5 
7.3 
10.4 
6.2 
8.7 
6.6 
7.7 
6.6 
19.7 
11.7 
12.3 
11.9 
13.9 
13.0 
13.3 
6.7 
9.5 
9.7 
10.6 
19.9 
12.9 
13.0 
12.3 
13.1 
12.6 
12.5 
13.4 
14.4 

asj 

11.0 
2.4 



p# 

0.69 
0.65 
0.62 
0.55 
0.7S 
0.79 
0.79 
0.79 
0.76 
0.62 
0.44 
0.44 
0.47 
0.33 
0.36 
0.36 
0.69 
0.62 
0.62 
0.50 
0.54 
0.38 
0.36 
0.42 
0.37 
0.35 
9.37 
0.33 
0.32 

kOS 
0.52 
0.16 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.



15694 
2120516 

0.63 
0.64 

Mf AN S.D. 

16. t 7.31 

19.6 5.32 

9.9 5.23 

0.2 0,76 

0.1 0.66 




RBIS 
0.45 
0.56 
6.73 
0.59 
0.54 
9.62 
6.79 
9.67 
0.87 
0.42 
0.49 
0.34 
0.52 
0.32 
0.41 
0.21 
0.61 
0.65 
0.66 
0.61 
0.46 
0.37 
0.41 
0.47 
0.32 
0.41 
0.41 
9.33 
0.22 
0.0? 
0.49 
0.17 

308 
43293 

0.79 
0.76 



DELTA 
11.1 
11.5 
9.4 
12.5 
10.3 
9.7 
9.6 
9.7 
9.9 
11.6 
13.6 
13.6 
13.3 
14.6 
14.5 
14.2 
11.1 
11.6 
11.6 
13.0 
12.6 
14.3 
14.4 
13.9 
14.3 
14.6 
14.3 
;4.6 
14.6 

12.7 
1.9 



10.5 
15.7 
13.9 
0.4 
0.1 



7.40 
5.20 
5.17 
1.42 
0.82 
Appendix A-4 (continued)



Item Analysis Statistics, History/Citizenship/Geography



i 
z 
3 
4 
5 
6 

7 

a 

9 



itch 
itch 

ITCH 
ITEM 
ITCH 
ITCH 
ITCH 
ITCH 
ITCH 
ITCH 10 
ITCH II 
ITCH 12 
ITCH 13 
ITEH I* 
ITEH 15 
ITCH 16 
ITCH 17 
ITCH 20 
ITCH 
ITCH 
ITEH k. 
ITEH 22 
ITEH 23 
ITCH 2* 
ITCH 25 
ITCH 26 
ITCH 27 
ITCH 24 
ITEH 29 
ITEH SO 
COLltti HE AN 
COLUMN S.D. 



!»♦ 
0.73 
0*64 
0.02 
0*52 
0.79 
0*71 
0*62 
9.60 
0.61 
0.70 
0.53 
0.44 
0.55 
0.49 
0.40 
0*30 
0.75 
0.71 
0.70 
0.56 
0.67 
0.44 
0.44 
0.46 
0.41 
0.40 
0.42 
0.39 
0.33 
0.24 
0.57 
0.17 



R6I5 
0.55 
0*64 
0.70 
0.57 
0.63 
0.55 
0.64 
0*72 
9*63 
0.44 
0*56 
0*46 
0.56 
0.42 
0.55 
0.46 
0.69 
0.57 
0.7" 
0.** 
0.59 
0.47 
0.51 
0.46 
0.36 
0.44 
0.52 
0.37 
0.24 
0.16 
0.54 
0.15 



DELTA 
10*5 
11.6 
9*3 
12.6 
9*7 
10.6 
9.4 
9.6 
9.5 
10.9 
12.7 
13.6 
12.5 
13. 1 
14.0 
14.2 
10.3 
10.6 
10.9 
12.4 
11*2 
13*6 
13.6 
13.2 
13.9 
14.0 
13.6 
14.2 
14.6 
15,6 
12.2 
1.6 



HISPANIC FEMALE      BLACK MALE      BLACK FEMALE

P+ RBIS DELTA P+ RBIS DELTA P+ RBIS DELTA

0.74 0.53 10.4 0.67 0.45 11.2 0.66 0.49 11.4 

0.64 0.55 11.6 0.72 0.61 10.7 0.75 0.56 10.3 

0.66 0.62 0.7 0.69 0.70 9.7 0.64 0.63 9.0 

0.47 0.50 13.3 0.55 0.61 12.5 0.53 0.53 12.7 

0.60 0.61 9.6 0.60 0.54 9.6 0.77 0.56 10.0 

0.76 0.55 9.9 9.77 0.53 10.0 0.70 0.54 9.9 

0.63 0.79 9.2 0.64 0.77 9.1 0.62 0.76 9.3 

9.76 0.67 9.9 0.63 9.60 9.1 9.62 0.67 9.3 

0.61 0.79 9.5 0.64 0.75 9.0 0.64 0.70 9.1 

0.64 0.39 11.5 0.61 0.36 11.9 0.63 0.39 11.6 

0.42 0.49 13.6 0.46 0.46 13.2 0.43 0.40 13.7 

0.51 9.40 12.9 0.42 0.43 13.6 0.49 0.46 13.1 

0.49 0.43 13.1 0.51 0.49 12.9 0.49 0.46 13.1 

0.49 0.44 13.1 0.36 0.36 14.4 0.33 0.33 14.7 

0.39 0.45 14.1 0.34 0.49 14.7 0.33 0.40 14.6 

0.37 0.35 14.3 0.36 9.35 14,2 0.33 0.27 14.7 

0.71 0.61 10.7 0.62 0.66 9.4 0.64 0.54 9.0 

0.69 0.40 11.0 0.66 0.54 11.1 0.69 0.45 11.1 

0.69 0.56 11.0 0.56 0.67 12.2 0.69 0.63 11.1 

0.52 0.53 12. f 0.50 0.50 13.0 0.53 0.46 12.7 

0.46 0.44 13.2 0.55 0.50 12.5 0.42 0.46 13.6 

0.43 0.41 13.7 0.34 0.49 14.7 0.34 0.44 14.7 

0.44 0.36 13.6 0.36 0.39 1*.3 0.43 0.39 13.7 

0.46 0.46 13.4 0.42 0.50 13.6 0.48 0.46 13.2 

0.40 0.42 14.0 0.36 0.36 14.2 0.41 0.44 13.9 

0.34 0.37 14.6 0.33 0.34 14.7 0.32 0.29 14.9 

0.40 0.53 14.1 0.16 0.51 14.4 0.41 0.45 13.9 

0.32 0.27 14.9 0.34 0.33 14.6 0.29 0.32 15.2 

0.29 0.33 15.2 0.33 0.19 14.6 0.31 0.33 15. 0 

0.22 0.17 16.0 0.23 0.01 13.9 0.£\ 0.10 16.3 

0.^5 0.49 12.4 0.54 0.46 12.5 0.54 0.47 12.5 

0.16 0.13 2.0 0.19 O.U 2.1 0.20 0.14 2.2 



WHITE MALE

P+ RBIS DELTA

0.63 0.59 9.2 

0.60 0.70 9.6 

0.91 9.62 7.6 

0.76 0.66 10.1 

0.69 9.64 6.0 

0.66 0.51 6.7 

0.93 0.66 7.0 

0.91 0.73 7.7 

0.94 0.66 6.7 

0.72 0.54 10.7 

0.67 0.69 11.2 

0.54 0.55 12.6 

0.64 0.65 11.6 

0.44 0.44 13.6 

0.51 0*63 12.9 

0.46 0.51 13.2 

0.66 0.66 0.6 

0.61 0.61 9.5 

0.77 0.60 10.0 

0.71 0.62 10.7 

0.76 0.67 10.0 

0.51 0.61 12.9 

0.50 0.53 13.0 

0.57 0.59 12.3 

0.40 0.47 13.2 

0.57 0.55 12.3 

0.57 0.64 12.3 

0.51 0.50 12.9 

0.36 0.36 14.4 

0»27 0.30 Ift^ 

0.67 0.61 10.9 

0.16 0.13 2.3 



WHITE FEMALE

P+ RBIS DELTA

0.64 0.57 9.1 

0.91 0.03 9.5 

0.94 9.76 6.9 

0.73 0.57 10.6 

0.66 0.69 6.3 

0.06 9.59 0.6 

0.94 0.65 6.6 

0.91 0.70 7.7 

0.95 0.67 6.5 

0.72 0.44 10.7 

0.59 9.62 12.1 

0.60 0.51 12.0 

0.57 0.55 12.3 

0.39 9.43 14.1 

0.50 0.56 13.0 

0.46 0.41 13.4 

0.65 0.60 6.9 

0.01 0.56 9.4 

0.62 0.71 9.3 

0.69 0.56 11.0 

0.63 0.54 11.7 

0.51 0.55 12 9 

0.49 0.46 13.1 

0.56 0.49 12.4 

O.Si 0.46 12.9 

0.51 0.51 12.9 

0.54 0.59 12-* 

0.42 0.45 13.6 

0.36 0.39 14.4 

0.23 16, y 

0.65 0.59 11.1 

0.20 0.12 2.5 



SAMPLE SIZE
POPULATION ESTIMATE 



1426 
150025 



1532 
149579 



1372 
167645 



1454 
194371 



7765 
1056913 



7797 
1051076 



COEFFICIENT ALPHA 
SPLIT HALF RELIABILITY 



0.63 
0.63 



0.76 
0.60 



0.77 
0.77 



0.76 
0.77 



0.64 
0.66 



0-61 
0.62 



FORMULA SCORE 
NUMBER RIGHT 
NUMBER WRONG
NUMBER OMITS
NUMBER NOT REACHED



ma 

12.3 
17.0 
12.5 
0.3 
0.1 



6.03 
5.72 
5.56 
1.16 
1.16 



11.5 
16.3 
13.1 
0.3 
0.2 



7.21 
5.16 
5.07 
1.34 
1.59 



11.2 
16.1 
13.4 
0.3 
0.2 



6.94 
5.01 
4.91 
1.09 
1.49 



11.2 
16.1 
13.4 
0.3 
0.2 



6.66 
4.65 
4.60 
0.96 
1.23 



7C 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.



UTAH 
26.7 
20.1 
9.7 
0,2 
0.1 



2JL 

7.S9 
5.W 
5.4S 
0.72 
0.67 



16.0 6.99 

19.6 S.07 

10.2 5.02 

0.2 0.82 

0.1 0.66 



Appendix B-1
Differential Item Functioning (DIF), Reading



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 21

                                 NO. LEVELS   LEVEL 1             LEVEL 2
GROUP VARIABLE: RACE             2            WHITE (REFERENCE)   ASIAN (FOCAL)
RESPONSE VARIABLE: ITEMSCOR      2            RIGHT               WRONG
STRATIFYING VARIABLE: # RIGHT    22

          MH ODDS  MH CHI-  PROB >  MH           STD ERR   STDZD   STD ERR    REFERENCE                  FOCAL
          RATIO    SQUARE   CHI-SQ  D-DIF        MH D-DIF  D-DIF   STD D-DIF  N      P+    N0            N     P+    N0   IMPACT
ITEM 1    0.82      1.53    0.22     0.47   A    0.36       0.39   0.32       15730  0.96  639           1495  0.96  66    0.00
ITEM 2    1.24      5.82    0.02    -0.51   A    0.21      -0.41   0.18       15724  0.89  639           1494  0.86  66    0.03
ITEM 3    1.28      8.51    0.00    -0.57   A    0.19      -0.44   0.17       15722  0.86  639           1494  0.82  66    0.04
ITEM 4    1.34     20.50    0.00    -0.69   A    0.15      -0.50   0.13       15696  0.65  647           1494  0.58  69    0.07
ITEM 5    1.33     17.29    0.00    -0.66   A    0.16      -0.45   0.13       15657  0.61  647           1465  0.55  69    0.06
ITEM 6    1.02      0.06    0.80    -0.04   A    0.16      -0.03   0.15       15730  0.67  647           1493  0.65  69    0.02
ITEM 7    1.06      0.83    0.36    -0.15   A    0.15      -0.10   0.13       15714  0.47  647           1493  0.45  68    0.02
ITEM 8    0.86      5.29    0.02     0.36   A    0.15       0.25   0.13       15701  0.55  694           1494  0.57  70   -0.01
ITEM 9    0.82      9.20    0.00     0.47   A    0.16       0.39   0.14       15140  0.68  645           1442  0.70  66   -0.03
ITEM 10   0.86      6.32    0.01     0.36   A    0.14       0.31   0.13       15073  0.44  686           1429  0.47  69   -0.03
ITEM 11   0.75     16.60    0.00     0.67   A    0.16       0.50   0.13       15670  0.64  646           1487  0.67  66   -0.03
ITEM 12   1.25      8.62    0.00    -0.52   A    0.18      -0.35   0.14       15675  0.78  [illegible]   1486  0.73  68    0.04
ITEM 13   0.90      2.88    0.09     0.25   A    0.15       0.19   0.13       15628  0.56  646           1464  0.57  68   -0.01
ITEM 14   0.85      5.36    0.02     0.38   A    0.16       0.25   0.13       15605  0.54  639           1470  0.55  66   -0.02
ITEM 15   0.90      2.38    0.12     0.25   A    0.16       0.16   0.13       15616  0.52  645           1479  0.53  68   -0.01
ITEM 16   1.01      0.00    0.95    -0.02   A    0.19      -0.01   0.16       15564  0.82  645           1470  0.80  68    0.01
ITEM 17   0.96      0.33    0.57     0.09   A    0.16       0.07   0.13       15521  0.60  645           1469  0.60  68    0.01
ITEM 18   0.93      1.44    0.23     0.18   A    0.15       0.15   0.13       15480  0.58  639           1463  0.59  66   -0.01
ITEM 19   1.06      0.70    0.40    -0.14   A    0.16      -0.10   0.14       15416  0.69  645           1446  0.67  66    0.02
ITEM 20   1.02      0.04    0.84    -0.04   A    0.17      -0.03   0.15       15380  0.76  645           1446  0.74  66    0.01
ITEM 21   1.17      5.19    0.02    -0.37   A    0.16      -0.27   0.13       15346  0.66  639           1444  0.65  66    0.04



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
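
The MH ODDS RATIO and MH D-DIF columns above are two views of the same quantity. A minimal sketch, assuming the standard Mantel-Haenszel common odds ratio pooled over score strata and the usual ETS delta-metric conversion D-DIF = -2.35 * ln(odds ratio); neither formula is spelled out in this appendix, and the function names and the stratum counts in the example are hypothetical:

```python
import math


def mh_odds_ratio(strata):
    """Mantel-Haenszel common odds ratio.

    `strata` is an iterable of (a, b, c, d) counts, one tuple per
    total-score level:
      a = reference group right, b = reference group wrong,
      c = focal group right,     d = focal group wrong.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n == 0:
            continue  # an empty score level contributes nothing
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


def mh_d_dif(odds_ratio):
    """Express the odds ratio on the delta metric (assumed -2.35 * ln)."""
    return -2.35 * math.log(odds_ratio)


# Hypothetical counts for two score levels
example = [(30, 10, 20, 20), (40, 5, 30, 10)]
print(round(mh_odds_ratio(example), 3))

# Odds ratio of 1.24, as tabled for Item 2 in the White/Asian comparison
print(round(mh_d_dif(1.24), 2))
```

Under these assumptions an odds ratio of 1.24 converts to a D-DIF of -0.51, matching the tabled value for Item 2; negative values indicate an item that favors the reference group after matching on total score.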
Appendix B-1 (continued)
Differential Item Functioning (DIF), Reading



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 21

                                 NO. LEVELS   LEVEL 1             LEVEL 2
GROUP VARIABLE: RACE             2            WHITE (REFERENCE)   HISPANIC (FOCAL)
RESPONSE VARIABLE: ITEMSCOR      2            RIGHT               WRONG
STRATIFYING VARIABLE: # RIGHT    22







         MH ODDS  MH CHI-  PROB >      MH     STD ERR   STDZD    STD ERR        REFERENCE               FOCAL
ITEM       RATIO   SQUARE  CHI-SQ   D-DIF    MH D-DIF   D-DIF  STD D-DIF        N     P+    NO+       N     P+   NO+  IMPACT

ITEM  1     0.75     9.73    0.00    0.69 A      0.22    0.57       0.20    15730   0.96    639    2994   0.94    33    0.02
ITEM  2     1.06     0.61    0.37   -0.13 A      0.24   -0.11       0.12    15724   0.89    639    2966   0.80    33    0.08
ITEM  3     1.04     0.49    0.48   -0.10 A      0.23   -0.09       0.21    15722   0.86    639    2900   0.76    33    0.10
ITEM  4     1.12     5.23    0.02   -0.26 A      0.22   -0.21       0.09    15696   0.65    647    2979   0.47    45    0.16
ITEM  5     1.16     9.32    0.00   -0.35 A      0.11   -0.28       0.20    15657   0.61    647    2965   0.43    43    0.19
ITEM  6     1.08     2.43    0.12   -0.17 A      0.11   -0.24       0.09    15730   0.67    639    2993   0.50    33    0.16
ITEM  7     1.14     7.35    0.01   -0.32 A      0.12   -0.25       0.20    15714   0.47    639    2985   0.30    33    0.17
ITEM  8     1.06     1.52    0.22   -0.14 A      0.11   -0.22       0.20    15701   0.55    647    2990   0.38    45    0.16
ITEM  9     0.85    12.42    0.00    0.39 A      0.11    0.32       0.20    15140   0.68    645    2829   0.59    40    0.09
ITEM 10     0.92     3.56    0.06    0.21 A      0.11    0.18       0.10    15073   0.44    644    2817   0.36    40    0.08
ITEM 11     0.75       --    0.00    0.68 A      0.11    0.56       0.10    15670   0.64    639    2952   0.54    33    0.09
ITEM 12     1.09     2.94    0.09   -0.21 A      0.12   -0.14       0.10    15675   0.78    646    2952   0.62    43    0.16
ITEM 13     0.93     2.32    0.13    0.16 A      0.11    0.15       0.10    15628   0.56    646    2931   0.44    44    0.12
ITEM 14     0.99     0.03    0.86    0.02 A      0.11    0.03       0.10    15605   0.54    639    2928   0.38    33    0.16
ITEM 15     0.86     9.56    0.00    0.37 A      0.12    0.26       0.10    15616   0.52    645    2915   0.38    43    0.15
ITEM 16     1.02     0.09    0.76   -0.04 A      0.13   -0.02       0.10    15564   0.82    679    2899   0.68    33    0.13
ITEM 17     1.14     7.42    0.01   -0.30 A      0.11   -0.21       0.10    15521   0.60    645    2684   0.42    42    0.19
ITEM 18     0.84    15.37    0.00    0.42 A      0.11    0.35       0.10    15480   0.58    629    2874   0.49    33    0.09
ITEM 19     1.11     4.47    0.03   -0.24 A      0.11   -0.18       0.10    15416   0.69    639    2631   0.53    33    0.17
ITEM 20     0.95     2.16    0.28    0.13 A      0.12    0.10       0.10    15380   0.76    645    2622   0.64    43    0.12
ITEM 21     1.0?     3.54    0.06   -0.22 A      0.11   -0.16       0.10    15348   0.68    639    2608   0.53    33    0.16

Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
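
The statistics tabulated in this appendix can be illustrated with a minimal sketch. It assumes the usual Mantel-Haenszel setup: one 2 x 2 table (reference or focal group by right or wrong) at each level of the stratifying number-right score, combined into a common odds ratio, and the conventional delta-scale transformation MH D-DIF = -2.35 ln(odds ratio). That transformation is consistent with the tabulated pairs (an odds ratio of 1.23 corresponds to a D-DIF of about -0.49). The function name, the input layout, and the continuity-corrected form of the chi-square are assumptions made for illustration; they are not taken from the program that produced these tables.

```python
import math

def mantel_haenszel_dif(strata):
    """Return (odds_ratio, chi_square, d_dif) for one item.

    strata: iterable of (ref_right, ref_wrong, focal_right, focal_wrong)
    counts, one tuple per level of the stratifying score (hypothetical layout).
    """
    num = den = 0.0                      # numerator and denominator of the common odds ratio
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        if n < 2:
            continue                     # a stratum this small carries no information
        num += a * d / n
        den += b * c / n
        n_ref, n_focal = a + b, c + d
        n_right, n_wrong = a + c, b + d
        observed += a                    # reference-group right answers actually seen
        expected += n_ref * n_right / n  # expected count if there were no DIF
        variance += n_ref * n_focal * n_right * n_wrong / (n * n * (n - 1))
    odds_ratio = num / den
    # Continuity-corrected Mantel-Haenszel chi-square (1 degree of freedom).
    chi_square = (abs(observed - expected) - 0.5) ** 2 / variance
    # Delta-scale index: negative values mean the item is relatively harder for the focal group.
    d_dif = -2.35 * math.log(odds_ratio)
    return odds_ratio, chi_square, d_dif

# Two made-up score levels, for illustration only.
example = [(40, 10, 30, 20), (30, 20, 20, 30)]
alpha, chi_sq, d_dif = mantel_haenszel_dif(example)
print(round(alpha, 2), round(chi_sq, 2), round(d_dif, 2))
```

The sketch does not guard against a zero denominator, which would arise only if no stratum contained both a wrong answer in the reference group and a right answer in the focal group.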



Appendix B-1--(continued)
Differential Item Functioning (DIF), Reading



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 21

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: RACE                 2       WHITE (REFERENCE)    BLACK (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       22



         MH ODDS  MH CHI-  PROB >      MH     STD ERR   STDZD    STD ERR        REFERENCE               FOCAL
ITEM       RATIO   SQUARE  CHI-SQ   D-DIF    MH D-DIF   D-DIF  STD D-DIF        N     P+    NO+       N     P+   NO+  IMPACT

ITEM  1     0.70    15.33    0.00    0.85 A      0.22    0.75       0.20    15730   0.96    639    2654   0.93    21    0.01
ITEM  2     1.23    13.06    0.00   -0.49 A      0.14   -0.39       0.12    15724   0.89    639    2642   0.76    21    0.13
ITEM  3     0.96     0.58    0.45    0.10 A      0.13    0.09       0.11    15722   0.86    639    2643   0.75    21    0.12
ITEM  4     1.39    46.67    0.00   -0.76 A      0.12   -0.60       0.10    15696   0.65    647    2637   0.40    30    0.25
ITEM  5     0.7?    ?6.?6    0.00    0.60 A      0.12    0.44       0.10    15657   0.61    647    2?17   0.47    30    0.15
ITEM  6     1.15     8.63    0.00   -0.34 A      0.11   -0.26       0.10    15730   0.67    639    2645   0.46    21    0.21
ITEM  7     1.09     2.97    0.09   -0.21 A      0.12   -0.17       0.11    15714   0.47    647    2632   0.26    30    0.19
ITEM  8     0.92     2.90    0.09    0.20 A      0.12    0.14       0.10    15701   0.55    647    2632   0.37    29    0.18
ITEM  9     0.78    ?5.0?    0.00    0.56 A      0.12    0.46       0.10    15140   0.68    645    2630   0.57    26    0.10
ITEM 10     0.85    11.61    0.00    0.39 A      0.11    0.36       0.11    15073   0.44    649    2614   0.36    26    0.09
ITEM 11     0.84    12.30    0.00    0.40 A      0.11    0.32       0.10    15670   0.64    639    2605   0.46    21    0.15
ITEM 12     1.29    25.15    0.00   -0.61 A      0.12   -0.40       0.10    15675   0.78    646    2605   0.55    29    0.23
ITEM 13     1.02     0.20    0.65   -0.05 A      0.11   -0.01       0.10    15628   0.56    646    2807   0.40    29    0.16
ITEM 14     0.76    ?5.9?    0.00    0.59 A      0.12    0.47       0.10    15605   0.54    639    2771   0.39    21    0.14
ITEM 15     0.69    46.65    0.00    0.87 A      0.12    0.?9       0.10    15616   0.52    645    2730   0.36    27    0.14
ITEM 16     0.86     7.52    0.01    0.36 A      0.13    0.26       0.11    15564   0.82    645    2701   0.68    12    0.14
ITEM 17     0.97     0.44    0.51    0.06 A      0.12    0.09       0.10    15521   0.60    639    2669   0.42    21    0.16
ITEM 18     0.82    17.27    0.00    0.47 A      0.11    0.37       0.10    15480   0.58    639    2642   0.??    21    0.13
ITEM 19     1.26    20.53    0.00   -0.54 A      0.12   -0.41       0.10    15416   0.69    645    2574   0.47    25    0.22
ITEM 20     1.09     0.52    0.47   -0.09 A      0.12   -0.06       0.10    15380   0.76    645    2567   0.59    25    0.16
ITEM 21     1.10     4.06    0.04   -0.23 A      0.12   -0.17       0.10    15348   0.68    639    25?4   0.50    21    0.18

Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.



Appendix B-1--(continued)
Differential Item Functioning (DIF), Reading 



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 21

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: RACE                 2       WHITE (REFERENCE)    AM IND (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       22



         MH ODDS  MH CHI-  PROB >      MH     STD ERR   STDZD    STD ERR        REFERENCE               FOCAL
ITEM       RATIO   SQUARE  CHI-SQ   D-DIF    MH D-DIF   D-DIF  STD D-DIF        N     P+    NO+       N     P+   NO+  IMPACT

ITEM  1     0.38    11.82    0.00    2.29 C      0.68    2.05       0.65    15730   0.96    639     307   0.95     2    0.00
ITEM  2     1.38     4.86    0.03   -0.77 A      0.34   -0.62       0.31    15724   0.89    647     306   0.73     4    0.16
ITEM  3     0.98     0.00    0.97    0.04 A      0.36    0.02       0.31    15722   0.86    639     306   0.73    --    0.13
ITEM  4     0.88     0.76    0.38    0.30 A      0.32    0.23       0.27    15696   0.65    647     306   0.47     4    0.16
ITEM  5     1.14     0.76    0.38   -0.31 A      0.32   -0.24       0.26    15657   0.61    647     304   0.39     4    0.23
ITEM  6     1.01     0.00    0.98   -0.03 A      0.31   -0.03       0.27    15730   0.67    647     307   0.47     4    0.19
ITEM  7     1.05     0.09    0.77   -0.12 A      0.34   -0.10       0.30    15714   0.47    647     305   0.28     4    0.20
ITEM  8     1.09     0.28    0.60   -0.21 A      0.34   -0.16       0.29    15701   0.55    694     305   0.33     7    0.12
ITEM  9     0.91     0.42    0.52    0.22 A      0.31    0.19       0.29    15140   0.68    645     261   0.54     4    0.14
ITEM 10     0.85     1.20    0.27    0.37 A      0.32    0.33       0.30    15073   0.44    606     279   0.35     7    0.10
ITEM 11     0.74     5.20    0.02    0.74 A      0.31    0.57       0.27    15670   0.64    646     301   0.50     4    0.14
ITEM 12     1.10     0.34    0.56   -0.22 A      0.34   -0.15       0.28    15675   0.78    646     303   0.56     4    0.21
ITEM 13     1.08     0.30    0.59   -0.18 A      0.30   -0.16       0.28    15628   0.56    646     302   0.37     4    0.19
ITEM 14     0.97     0.04    0.85    0.00 A      0.32    0.06       0.29    15605   0.54    639     303   0.34     2    0.19
ITEM 15     0.79     2.35    0.13    0.54 A      0.34    0.40       0.29    15616   0.52    645     298   0.35     4    0.17
ITEM 16     1.06     0.09    0.76   -0.13 A      0.35   -0.09       0.29    15564   0.82    645     297   0.63     4    0.19
ITEM 17     0.90     0.65    0.42    0.26 A      0.31    0.23       0.28    15521   0.60    645     295   0.42     3    0.18
ITEM 18     1.15     0.99    0.32   -0.34 A      0.31   -0.29       0.29    15480   0.58    639     295   0.39     2    0.19
ITEM 19     1.20     1.53    0.21   -0.42 A      0.32   -0.32       0.28    15416   0.69    645     297   0.46     4    0.23
ITEM 20     0.99     0.00    0.97    0.03 A      0.32    0.03       0.28    15380   0.76    645     295   0.58     4    0.17
ITEM 21     1.23     2.34    0.13   -0.49 A      0.31   -0.40       0.28    15348   0.68    639     295   0.46     2    0.22


Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.



Appendix B-1--(continued)
Differential Item Functioning (DIF), Reading 



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 21

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: SEX                  2       MALE (REFERENCE)     FEMALE (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       22







MH ODDS    MH CHI-    PROB >    MH    STD ERR
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Appendix B-2—(continued) 
Differential Item Functioning (DIF), Mathematics 
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Appendix B-2—(continued) 
Differential Item Functioning (DIF), Mathematics 
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Appendix B-2--(continued)
Differential Item Functioning (DIF), Mathematics 
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Source: U.S. Department of Education, National Center for Education Statistics, National Education 
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Appendix B-3 
Differential Item Functioning (DIF), Science 
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U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
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Differential Item Functioning (DIF), Science
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Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 



Appendix B-3--(continued)
Differential Item Functioning (DIF), Science
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
         MH ODDS  MH CHI-  PROB >      MH     STD ERR   STDZD    STD ERR        REFERENCE               FOCAL
ITEM       RATIO   SQUARE  CHI-SQ   D-DIF    MH D-DIF   D-DIF  STD D-DIF        N     P+    NO+       N     P+   NO+  IMPACT

ITEM  7       --       --      --      --          --      --         --    15636   0.69    136    2611   0.56     4    0.13
ITEM  8     0.89     6.10    0.01    0.26 A      0.11    0.25       0.10    15707   0.61     26    2622   0.49     0    0.12
ITEM  9     0.88     7.55    0.01    0.31 A      0.11    0.27       0.10    15693   0.66     41    2817   0.55     6    0.13
ITEM 10     0.91     2.06    0.15    0.16 A      0.11    0.15       0.10    15513   0.60     41    2761   0.44     6    0.16
ITEM 11     0.83    15.52    0.00    0.43 A      0.11    0.39       0.10    15447   0.52     31    2749   0.43     1    0.09
ITEM 12     1.05     0.97    0.32   -0.12 A      0.12   -0.08       0.10    14685   0.75     ??    2699   0.56     4    0.19
ITEM 13     0.88     5.65    0.02    0.29 A      0.12    0.24       0.11    15397   0.78     26    2691   0.66     0    0.12
ITEM 14     2.30   271.47    0.00   -1.96 C      0.12   -1.59       0.11    15692   0.63    136    2814   0.27     5    0.36
ITEM 15     0.95     2.31    0.13    0.16 A      0.12    0.16       0.11    15552   0.43     31    2741   0.30     1    0.13
ITEM 16     0.82    18.46    0.00    0.47 A      0.11    0.43       0.10    15510   0.50     26    2753   0.41     0    0.09
ITEM 17     0.96     0.67    0.41    0.10 A      0.11    0.12       0.11    15582   0.47     31    2759   0.32     1    0.15
ITEM 18     1.16    11.15    0.00   -0.39 A      0.12   -0.31       0.11    15526   0.52     31    2741   0.31     2    0.21
ITEM 19     0.94     1.70    0.19    0.15 A      0.11    0.10       0.11    15561   0.47     39    2750   0.33     8    0.14
ITEM 20     0.85    11.43    0.00    0.37 A      0.11    0.34       0.11    15545   0.45     31    2722   0.37     2    0.08
ITEM 21     0.85    15.66    0.00    0.43 A      0.11    0.41       0.11    15537   0.46     26    2719   0.37     0    0.09
ITEM 22     0.89     6.04    0.01    0.28 A      0.11    0.24       0.11    15443   0.40     31    2695   0.31     2    0.09
ITEM 23     0.95     1.12    0.29    0.12 A      0.11    0.10       0.11    15102   0.43     31    2651   0.35     2    0.07
ITEM 24     1.04     0.37    0.54   -0.08 A      0.13   -0.08       0.12    15530   0.36     40    2634   0.21     9    0.17
ITEM 25     1.00     0.00    1.00    0.00 A      0.14   -0.01       0.14    15470   0.24     26    2676   0.16     0    0.09

Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
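
The letter printed beside each MH D-DIF value (A, B, or C) appears to follow the three-level severity classification commonly used with this statistic. The sketch below encodes one widely cited form of that rule: A when the absolute D-DIF is below 1.0 or is not significantly different from zero, C when it is at least 1.5 and significantly greater than 1.0, and B otherwise. The function name and the exact critical values (1.96 and 1.645) are assumptions; the report itself should be consulted for the rule actually applied.

```python
def dif_category(mh_d_dif, std_err):
    """Classify an item as 'A', 'B', or 'C' from MH D-DIF and its standard error."""
    size = abs(mh_d_dif)
    # Two-sided test against zero at the 5 percent level (assumed critical value).
    differs_from_zero = size / std_err > 1.96
    # One-sided test that the absolute value exceeds 1.0 (assumed critical value).
    exceeds_one = (size - 1.0) / std_err > 1.645
    if size < 1.0 or not differs_from_zero:
        return "A"      # negligible DIF
    if size >= 1.5 and exceeds_one:
        return "C"      # large DIF
    return "B"          # intermediate DIF

# Values of the kind tabulated above: (MH D-DIF, STD ERR MH D-DIF).
for d, se in [(-1.96, 0.12), (-1.02, 0.33), (0.29, 0.12)]:
    print(d, se, dif_category(d, se))
```

Under these assumed thresholds, a D-DIF of -1.96 with a standard error of 0.12 is classified C and a D-DIF of 0.29 with a standard error of 0.12 is classified A, which agrees with the letters shown for those values in the table.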



Appendix B-3--(continued)
Differential Item Functioning (DIF), Science 
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10(1 



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 25

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: RACE                 2       WHITE (REFERENCE)    AM IND (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       26







         MH ODDS  MH CHI-  PROB >      MH     STD ERR   STDZD    STD ERR        REFERENCE               FOCAL
ITEM       RATIO   SQUARE  CHI-SQ   D-DIF    MH D-DIF   D-DIF  STD D-DIF        N     P+    NO+       N     P+   NO+  IMPACT

ITEM  1     1.20     1.66    0.17   -0.43 A      0.30   -0.36       0.26       --   0.76    586     305   0.55     1    0.21
ITEM  2     1.06     0.21    0.65   -0.16 A      0.33   -0.15       0.29    15698   0.82    381     102   0.67     0    0.16
ITEM  3     0.91     0.52    0.47    0.23 A      0.30    0.19       0.26    15636   0.69    386     301   0.55     1    0.14
ITEM  4     1.06     0.14    0.71   -0.13 A      0.30   -0.12       0.26    15677   0.71    395     301   0.55     5    0.16
ITEM  5     0.87     0.72    0.40    0.33 A      0.35    0.22       0.29    15673   0.61    386     302   0.63     1    0.16
ITEM  6     1.03     0.02    0.90   -0.07 A      0.34   -0.05       0.28    15649   0.82    386     296   0.62     1    0.20
ITEM  7     0.85     1.43    0.23    0.40 A      0.31    0.30       0.26    15636   0.69    365     300   0.56     1    0.13
ITEM  8     0.99     0.00    0.98    0.03 A      0.30    0.03       0.28    15707   0.61    386     304   0.46     1    0.15
ITEM  9     1.06     0.15    0.70   -0.13 A      0.30   -0.11       0.28    15693   0.66    386     301   0.50     1    0.16
ITEM 10     1.09     0.33    0.57   -0.20 A      0.31   -0.16       0.26    15513   0.60    396     299   0.41     3    0.19
ITEM 11     1.01     0.00    0.99   -0.02 A      0.30   -0.01       0.29    15447   0.52    395     294   0.38     3    0.14
ITEM 12     0.80     2.47    0.22    0.52 A      0.32    0.45       0.29    14385   0.75   1272     285   0.60     1    0.14
ITEM 13     0.90     0.48    0.49    0.26 A      0.34    0.19       0.29    15397   0.78    380     268   0.64     1    0.14
ITEM 14     1.54     9.79    0.00   -1.02 B      0.33   -0.82       0.29    15692   0.63    386     297   0.34     1    0.29
ITEM 15     0.94     0.15    0.70    0.15 A      0.33    0.13       0.30    15552   0.43    394     294   0.30     2    0.13
ITEM 16     0.99     0.00    0.98    0.01 A      0.30    0.02       0.29    15510   0.50    386     298   0.36     1    0.14
ITEM 17     0.90     0.54    0.46    0.25 A      0.31    0.23       0.29    15582   0.47    386     300   0.33     1    0.13
ITEM 18     1.00     0.00    0.96    0.00 A      0.32   -0.01       0.29    15526   0.52    386     298   0.34     2    0.18
ITEM 19     1.07     0.19    0.67   -0.16 A      0.32   -0.15       0.30    15581   0.47    394     300   0.30     4    0.17
ITEM 20     1.01     0.00    0.99   -0.02 A      0.30    0.00       0.29    15545   0.45    394     297   0.33     2    0.11
ITEM 21     0.88     0.85    0.36    0.29 A      0.30    0.29       0.29    15537   0.46    369     296   0.36     2    0.10
ITEM 22     1.13     0.68    0.41   -0.30 A      0.33   -0.29       0.32    15443   0.40    385     295   0.26     1    0.14
ITEM 23     0.77     4.27    0.04    0.62 A      0.29    0.60       0.29    15182   0.43    376     ?93   0.40     1    0.03
ITEM 24     1.17     0.67    0.35   -0.37 A      0.37   -0.32       0.35    15530   0.36    395     296   0.20     3    0.18
ITEM 25     0.97     0.01    0.94    0.06 A      0.39    0.05       0.38    15470   0.24    379     294   0.16     1    0.08

Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
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Appendix B-3-- (continued) 
Differential Item Functioning (DIF), Science



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 25

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: SEX                  2       MALE (REFERENCE)     FEMALE (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       26



MH ODDS RATIO   MH CHI-SQUARE   PROB > CHI-SQ   MH D-DIF   STD ERR MH D-DIF   STDZD D-DIF   STD ERR STD D-DIF   REFERENCE (N, P+, NO+)   FOCAL (N, P+, NO+)   IMPACT

ITEM 1
ITEM 2
ITEM 3
ITEM 4
ITEM 5
ITEM 6
ITEM 7
ITEM 8
ITEM 9
ITEM 10
ITEM 11
ITEM 12
ITEM 13
ITEM 14
ITEM 15
ITEM 16
ITEM 17
ITEM 18
ITEM 19
ITEM 20
ITEM 21
ITEM 22
ITEM 23
ITEM 24
ITEM 25



086 
1.29 
0.97 
0.67 
1.19 

1.56 

1.27 
89 
00 
14 
56 
77 
39 



0.73 
0.87 
1.12 
1.29 
0.95 
0.93 
1,10 
0.74 
0.97 
0.<*3 
0. 79 



20.47 
53.34 
0.82 
173.63 
21.28 
3.08 
227.68 
71.68 
14.29 
0.00 
21.98 
173.12 
60. 73 
109. 78 
118. 3$ 
23.97 
14.60 
72. *8 
3.04 
6.26 
10. 32 
107.52 
0. 99 
5.03 
48.*7 



0.00 
0.00 
0.37 
0.00 
CO 
08 
00 
00 
00 
00 



0.00 
0.00 
0.00 
0.00 
0.00 
0.00 
0.00 
0.00 
0,08 
0.01 
0.00 
0 00 
0.32 
0.02 
0.00 



0.34 
-0.61 

0.07 

0.93 
-0.41 

0.25 
-1.08 
-0.57 

0.27 

0.00 
-0.31 
-1.05 B 

0 62 A 
-0.77 

0.75 

0.33 
-0.26 
-0.59 

0.12 

0.17 
•0.22 

0.71 

0.07 

0,17 

0.55 



A 
A 
A 
A 
A 
A 
A 
A 
A 
A 
A 
A 



0.08 
0.08 
0.07 
0-07 
0.09 
9.09 
0.07 
0.07 
0.07 
0.07 
0*07 
0.08 
0.08 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.07 
0.08 



0.29 
-0.51 
0.05 
0.64 
-0,29 
0.11 
-0.91 
-0.49 
0.21 
0.00 
-0.26 
-0.S4 
0.52 
-0.57 
0.66 
0.29 
-0.21 
-0.47 
0.09 
0.14 
-0.18 
0.65 
0.06 
0.13 
0.51 



0.07 
0.07 
0.06 
0.07 
0.07 
0.07 
0.06 
0.06 
0.06 
0.06 
0.06 
0.07 
0.07 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.06 
0.08 



11617 
11610 
1153d 
11580 
11563 
11550 
11553 
11628 
11609 
11441 
11370 
10997 
11209 
11569 
11431 
11446 
11488 
11429 
11448 
11413 
11406 
11365 
11218 
11401 
11329 



0.70 
0.62 
0.66 
0.65 
0.78 
77 
72 
61 
65 
57 
52 
75 



0.74 
0.60 
0.39 
0.46 
0.46 
0.51 
0.45 
0.43 
0.46 
0.36 
0.42 
0.35 
0.22 



34 
25 
34 
25 
129 
332 
124 
25 
34 
44 
25 
123 
25 
123 
33 
25 
33 
25 
25 
33 
25 
33 
25 
33 
25 



11737 
11739 
11666 
11709 
11699 
11682 
11677 
11714 
11715 
11610 
11544 
11175 
11563 
11706 
11573 
11536 
11583 
11565 
11625 
11564 
11572 
11449 
11232 
11504 
11486 



0.70 
0.77 
0*65 
0.7', 
0.75 
0.77 
0.61 
0.54 



0.65 
0.54 
0.«7 
0.66 
0.77 
0.50 
0.43 
0.46 
0.40 
0.42 
0.42 
0.42 
0.41 
0.40 
0.40 
0.33 
0.13 



12 

6 
12 

6 
47 
137 
46 

8 
12 
20 

8 
46 

9 
47 
12 

8 
12 

9 

9 
13 

9 
13 

9 
13 

9 



0.00 



05 
01 
06 
04 
01 
10 



0.07 
0.00 
0.03 
0.05 
0.09 

-0.03 
0.10 

-0.04 
0.00 
O.06 
0.09 
0.02 
0.01 
0.05 

-0.04 
0.01 
0.02 

-0.01 



Source: U.S. Department of Education, National Center for Education Statistics, National Education
Longitudinal Study of 1988: Base Year Survey.
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Appendix B-4 

Differential Item Functioning (DIF), History/Citizenship/Geography



MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 30

                                NO. LEVELS   LEVEL 1              LEVEL 2
GROUP VARIABLE: RACE                 2       WHITE (REFERENCE)    ASIAN (FOCAL)
RESPONSE VARIABLE: ITEMSCOR          2       RIGHT                WRONG
STRATIFYING VARIABLE: # RIGHT       31



MH ODDS RATIO   MH CHI-SQUARE   PROB > CHI-SQ   MH D-DIF   STD ERR MH D-DIF   STDZD D-DIF   STD ERR STD D-DIF   REFERENCE (N, P+, NO+)   FOCAL (N, P+, NO+)   IMPACT

ITEM 1
ITEM 2
ITEM 3
ITEM 4
ITEM 5
ITEM 6
ITEM 7
ITEM 8
ITEM 9
ITEM 10
ITEM 11
ITEM 12
ITEM 13
ITEM 14
ITEM 15
ITEM 16
ITEM 17
ITEM 18
ITEM 19
ITEM 20
ITEM 21
ITEM 22
ITEM 23
ITEM 24
ITEM 25
ITEM 26
ITEM 27
ITEM 28
ITEM 29
ITEM 30



Source: 



0.87 
1.52 
1,2* 
2.00 
1.21 
1.14 
2.16 
1.49 
3.20 
2.01 
0.98 
0.66 
2.14 
0.47 
0,95 
0,76 
1,60 
1.05 
0.60 
1.43 
0,65 
0.7* 
0.96 
1.10 
0.95 



2* 
93 
12 
91 
91 



2.42 
28.58 
3.32 
i09.97 
3.68 
2.20 
37.24 
24.51 
89.63 
0.01 
0,06 
^3.15 
4.19 
156.2** 
0.64 
19.64 
33.42 
0.32 
29. 21 
29.13 
33.18 
25,68 
0.35 
2,43 
0.61 
12.79 
1,23 
3.02 
2-57 
1.7* 



0.12 
0.00 
0.07 
0.00 
0.06 
0.15 
0.00 
0,00 
0.00 
0.93 
0.81 
0.00 
0.04 
.00 
0.42 
0.00 
0.00 
0.57 
0.00 
0.00 
0.00 
0.00 
0.55 
0.12 
0.44 
0.00 
0.27 
0.08 
0,11 
0.19 



A 
A 
A 
A 



0.33 A 
-0.98 A 
-0.50 A 
-1.63 C 
-0.45 A 
-0.30 A 
-1.81 C 
-0.94 A 
-2.66 C 
-0.02 

0.04 

0.97 
-0.32 

1.77 C 

0.12 A 

0.64 A 
-1.11 B 
-0.11 A 

1.21 B 
-0.84 A 

1.01 B 

0.74 A 

0.09 A 
-0.23 A 

0.12 
-0.51 

017 
-0.26 

0.23 

0.21 



0.21 
0.28 
0.2B 
0.16 
0.23 
0.20 
0.30 
0.24 
0.29 
0.26 
0.16 
15 
15 
24 
15 



0.14 
0.20 
0 19 
0.23 
0.26 
0.18 
0,15 
0.14 
0.14 
0.24 
0,14 
0.25 
0.15 
0.24 
0.16 



0.28 
-0.70 
•0.37 
-1.23 
-0.37 
-0.26 
-1.22 
-0.70 
-1.67 
-0.01 
0.03 
0.81 
-0,24 
1.48 
0.09 
0.52 
-0.86 
-0.08 
0.79 
-0.65 
0.79 
0.57 
O.07 
-0.28 
0.09 
-0.41 
0,13 
-0.20 
0.19 
0,28 



0.29 
0.15 
0.23 
0.13 
0.20 
0.19 
0.23 
0.22 
0.22 
0.26 
0.23 
0.24 
0.23 
0.23 
0.23 
0.23 
0.17 
0,26 
0.28 
0.24 
0.26 
0,23 
0.23 
0.23 
0.13 
0.23 
0.23 
0.23 
0.13 
0.14 



15457 
15668 
15677 
25628 
15581 
15595 
15594 
25583 
15596 
25638 
25637 
25623 
25560 
25541 
25654 
25643 
25634 
25653 
25630 
15609 
15590 
15581 
25593 
15557 
25376 
25559 
25527 
25496 
25530 
15472 



0.85 
0.82 
0.93 
0.76 
0.90 
0.87 
0.95 
0.92 
0.95 
0.73 
0.65 
0.59 
0.63 
0.44 
0.52 
0.48 
0.87 
0.82 
0.81 
0.72 
0.72 
0.53 
0.51 
0.58 
0.52 
0.55 
0.57 
0.48 
0.38 
0.26 



208 
208 

2114 
208 
633 
218 

1966 
837 

2206 
208 
208 
208 
208 
240 
208 
208 
208 
208 
623 
208 
208 
208 
208 
208 
220 
221 
221 
221 
221 
222 



1463 
1483 
2460 
1477 
1474 
1471 
1470 
1468 
1471 
1477 
1474 
1470 
1465 
1471 
1483 
1481 
1473 
1480 
1475 
1480 
1474 
1475 
1469 
1472 
1452 
1467 
1460 
1450 
2459 
1454 



0.87 

0,77 

0.92 

0.66 

0.89 

0.86 

0.91 

0.89 

0.90 

0.74 

0.67 

0.68 

0.62 

0.61 

0.55 

0.56 

0.82 

0.82 

0.86 

0.67 

0.79 

0.61 

0.54 

0.57 

0.55 

0.53 

0.61 

0.48 

0.42 

0.29 



33 
33 

242 
33 
83 
33 

216 
99 

240 
33 
33 
33 
33 
42 
33 
33 
33 
33 
83 
33 
33 
33 
33 
33 
33 
33 
33 
33 
33 
33 



-0.02 
0.04 
0.01 
0.10 
0.02 
0.02 
0.04 
0.03 
0.05 
-0.02 
-0.02 
-0.09 
0.01 
-0.17 
-0.03 
-0.07 
0.05 
0.00 
-0.05 
0.05 
-0.07 
-0.08 
-0.03 
0.00 
-0.03 
0.02 
-0.03 
0.00 
-0.04 
-0.03 



U.S. Department of Education, National Center for Education Statistics, National Education Longitudinal Study of 1988: Base Year Survey.
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Differential Item Functioning (DIF), History/Citizenship/Geography
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Differential Item Functioning (DIF), History/Citizenship/Geography

MANTEL-HAENSZEL ODDS-RATIO AND OTHER STATISTICS, NUMBER OF TABLES = 30
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Source: U.S. Department of Education, National Center for Education Statistics, National Education 
Longitudinal Study of 1988: Base Year Survey . 
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ITEM PARAMETERS FOR READING TEST 
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SOURCE: U.S. Department of Education, National Center for Education Statistics, National 
Education Longitudinal Study of 1988: Base Year Survey. 
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SOURCE: U.S. Department of Education, National Center for Education Statistics, National 
Education Longitudinal Study of 1988: Base Year Survey. 
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ITEM PARAMETERS FOR SCIENCE TEST 
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9845 




0.2824    0.2069

S.D.    0.3749    0.9500    0.1040







SOURCE: U.S. Department of Education, National Center for Education Statistics, National 
Education Longitudinal Study of 1988: Base Year Survey. 
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C-4 



ITEM PARAMETERS FOR HISTORY/CITIZENSHIP/GEOGRAPHY TEST 

ITEM
NUMBER        A       S.E.         B       S.E.        C       S.E.

ITEM  1    1.0496   (0.030)    -0.5444   (0.035)    0.4565   (0.012)
ITEM  2    0.9833   (0.021)    -0.8964   (0.029)    0.2195   (0.012)
ITEM  3    1.6649   (0.044)    -1.3435   (0.025)    0.3644   (0.013)
ITEM  4    1.0102   (0.023)    -0.3776   (0.024)    0.2367   (0.010)
ITEM  5    1.1296   (0.031)    -1.0224   (0.038)    0.4635   (0.013)
ITEM  6    0.5205   (0.017)    -1.6335   (0.094)    0.3680   (0.023)
ITEM  7    1.5133   (0.033)    -1.8517   (0.021)    0.0826   (0.011)
ITEM  8    0.9790   (0.022)    -1.7132   (0.036)    0.2097   (0.016)
ITEM  9    1.5849   (0.035)    -1.8688   (0.020)    0.0762   (0.010)
ITEM 10    1.1069   (0.036)     0.2149   (0.027)    0.4689   (0.008)
ITEM 11    2.0744   (0.049)     0.1959   (0.011)    0.2964   (0.006)
ITEM 12    0.7068   (0.020)     0.1729   (0.030)    0.1911   (0.010)
ITEM 13    1.4423   (0.036)     0.2593   (0.015)    0.3025   (0.006)
ITEM 14    0.9478   (0.034)     1.0496   (0.021)    0.2660   (0.006)
ITEM 15    1.3145   (0.031)     0.4760   (0.013)    0.2020   (0.006)
ITEM 16    1.5454   (0.047)     0.8897   (0.014)    0.3017   (0.005)
ITEM 17    0.8238   (0.018)    -1.4562   (0.039)    0.1947   (0.016)
ITEM 18    0.9370   (0.025)    -0.6494   (0.036)    0.3659   (0.013)
ITEM 19    1.6059   (0.034)    -0.6313   (0.017)    0.257?   (0.009)
ITEM 20    0.8968   (0.021)    -0.2790   (0.027)    0.2226   (0.010)
ITEM 21    1.1929   (0.030)    -0.0569   (0.021)    0.3294   (0.008)
ITEM 22    1.4767   (0.037)     0.5534   (0.013)    0.2538   (0.005)
ITEM 23    1.2290   (0.037)     0.7582   (0.016)    0.2912   (0.006)
ITEM 24    0.7872   (0.021)     0.2554   (0.025)    0.1891   (0.009)
ITEM 25    0.8587   (0.028)     0.7691   (0.023)    0.2539   (0.008)
ITEM 26    1.2166   (0.033)     0.6286   (0.016)    0.2620   (0.006)
ITEM 27    1.1746   (0.027)     0.2807   (0.015)    0.1878   (0.007)
ITEM 28    1.8998   (0.055)     0.8826   (0.011)    0.2814   (0.004)
ITEM 29    1.4052   (0.053)     1.3309   (0.017)    0.2611   (0.004)
ITEM 30    2.2371   (0.089)     1.5372   (0.013)    0.1902   (0.003)

MEAN       1.2438              -0.1357              0.2682
S.D.       0.3974               0.9715              0.0941






SOURCE: U.S. Department of Education, National Center for Education Statistics, National 
Education Longitudinal Study of 1988: Base Year Survey. 
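
The A, B, and C columns in the item parameter tables read naturally as the discrimination, difficulty, and lower-asymptote (guessing) parameters of a three-parameter logistic item response model. The following minimal sketch shows how such parameters translate into a probability of a correct response at a given theta. The function name and the scaling constant of 1.7 are assumptions made for illustration; the report's own scoring program may use a different parameterization.

```python
import math

def p_correct(theta, a, b, c, scale=1.7):
    """Three-parameter logistic probability of a correct response.

    a: discrimination, b: difficulty, c: lower asymptote.
    scale=1.7 is the usual normal-ogive scaling constant (an assumption here).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))

# Parameters tabulated for item 1 of the History/Citizenship/Geography test.
a, b, c = 1.0496, -0.5444, 0.4565
for theta in (-2.0, b, 0.0, 2.0):
    print(f"theta={theta:+.2f}  P={p_correct(theta, a, b, c):.3f}")
```

At theta equal to the difficulty parameter the probability is halfway between the lower asymptote and 1, which is a convenient check on any implementation.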
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APPENDIX D 
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APPENDIX D 
Test Information Functions 



Appendix D presents the test information functions for the 8th Grade test forms. 
The test information functions can be interpreted as a plot of the reciprocal of the 
square of the standard error of measurement for all values of theta. In general, 
information functions of 1.0 and higher are considered quite acceptable. Over 90% of
the students' scores are in the theta range that meets this criterion on all four tests. The 
information functions for Science and History/Citizenship/Geography are less peaked 
and have broad band measurement properties. Reading and Mathematics are slightly 
more peaked, with the best measurement slightly above the mean. 
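
The relationship described above, information as the reciprocal of the squared standard error of measurement, can be sketched numerically. The example assumes a three-parameter logistic model with a scaling constant of 1.7 and uses a made-up three-item test; the function names and the item parameters are hypothetical and serve only to show how an information value of 1.0 corresponds to a standard error of 1.0 on the theta scale.

```python
import math

def item_information(theta, a, b, c, scale=1.7):
    """Information contributed by one three-parameter logistic item at theta."""
    p = c + (1.0 - c) / (1.0 + math.exp(-scale * a * (theta - b)))
    q = 1.0 - p
    return (scale * a) ** 2 * (q / p) * ((p - c) / (1.0 - c)) ** 2

def total_information(theta, items, scale=1.7):
    """Test information: the sum of the item informations at theta."""
    return sum(item_information(theta, a, b, c, scale) for a, b, c in items)

def sem_from_information(info):
    """Standard error of measurement implied by a given information value."""
    return 1.0 / math.sqrt(info)

# Hypothetical (a, b, c) triples, not taken from the report.
items = [(1.0, -1.0, 0.20), (1.2, 0.0, 0.25), (0.9, 1.0, 0.20)]
for theta in (-2, -1, 0, 1, 2):
    info = total_information(theta, items)
    print(theta, round(info, 3), round(sem_from_information(info), 3))
```

Because information adds across items, longer tests or tests with more discriminating items reach the 1.0 criterion over a wider theta range.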
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APPENDIX D-1 



NELS:88 Grade 8 Reading Test 
21 Items 
Test Information Function 




Source: U.S. Department of Education, National Center for Education Statistics, 
National Education Longitudinal Study of 1988: Base Year Survey. 
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APPENDIX D-2 



NELS:88 Grade 8 Mathematics Test
40 Items 
Test Information Function 



[Figure: test information (vertical axis) plotted against THETA (horizontal axis, -3 to 3)]

Information function = reciprocal of square of standard error of measurement.



Source: U.S. Department of Education, National Center for Education Statistics,
National Education Longitudinal Study of 1988: Base Year Survey. 
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APPENDIX D-3 



NELS:88 Grade 8 Science Test 
25 Items 
Test Information Function 



[Figure: test information (vertical axis) plotted against THETA (horizontal axis)]

Information function = reciprocal of square of standard error of measurement.



Source: U.S. Department of Education, National Center for Education Statistics, 
National Education Longitudinal Study of 1988: Base Year Survey. 
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APPENDIX D-4



NELS:88 Grade 8 History Test 
30 Items 
Test Information Function 




Information function = reciprocal of square of standard error of measurement.



Source: U.S. Department of Education, National Center for Education Statistics, 
National Education Longitudinal Study of 1988: Base Year Survey.
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Item  Content    Process          # Options  Source

Reading Passage 1
  1   Literary   Repro-Detail         5      NAEP-R
  2   Literary   Repro-Detail         5      NELS
  3   Literary   Repro-Detail         5      NAEP-R
  4   Literary   Inference/Eval       5      NELS
  5   Literary   Inference/Eval       5      NELS

Reading Passage 2
  6   Science    Repro-Detail         5      NELS
  7   Science    Inference/Eval       5      HSB
  8   Science    Comprehension        5      NELS

Reading Passage 3
  9   Poetry     Comprehension        4      3IBR-R
 10   Poetry     Inference/Eval       4      3IBR-R
 11   Poetry     Inference/Eval       4      3IBR-R
 12   Poetry     Inference/Eval       4      3IBR-R
 13   Poetry     Inference/Eval       4      3IBR-R
 14   Poetry     Inference/Eval       4      NELS
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APPENDIX E-l 

Reading Comprehension Items 



Description of Reading Passages and Items 



Passage 1: A fable containing dialogue between two characters. 

Identify the objective of a character's course of action 
Identify a character's assumption in planning his actions 
Identify the reason the character's plan didn't work 
Choose which personalty trait is suggested by the story 
Choose the adage that best fits the lesson to be learned 

Passage 2: A paragraph relating events in geologic time and evolution 
to the span of a year. 

Demonstrate understanding of the time-line metaphor 
Choose the event the author seems least certain about 
Relate two events using the time-line 

Passage 3: A metaphorical poem consisting of parallels between the 
author's emotional crisis and a writing assignment 

Identify the tension or conflict implied in the poem 

Infer the meaning of a metaphor from the context of the line 

Evaluate personality traits suggested by the poem 

Choose the mood suggested by the tone of a phrase 

Identify the author's state of mind 

Identify an example of personification 
APPENDIX E-1 (Continued)

Description of Reading Comprehension Items

Item  Content    Process         # Options  Source  Description of Reading Passages and Items

Reading Passage 4: A short biography of a Black musician.

15    Biography  Comprehension   4          3IBR    Evaluate the main purpose of the passage
16    Biography  Inference/Eval  4          3IBR    Define the meaning of a phrase
17    Biography  Inference/Eval  4          3IBR    Evaluate the tone of a character's remark in context
18    Biography  Inference/Eval  4          3IBR    Choose a statement supported by evidence in passage

Reading Passage 5: A short essay on the experiences of pioneer women in the [illegible]

19    Literary   Inference/Eval  4                  Identify author's reason for a quote from a diary
20    Literary   Inference/Eval  4                  Identify author's attitude toward pioneer women
21    Literary   Inference/Eval  4                  Explain reason for a specified assumption

Source entries legible for items 19-21: 3IBR, NELS.

Notes: [illegible]
APPENDIX E-2

Description of Mathematics Items

Item  Content      Process          # Options  Source  Item Description

 1    Algebra      Skill/Knowledge  4          HSB     Compare 2 algebraic expressions, given values of variables
 2    Data/Prob    Und/Comp         4          HSB     Compare two numbers read from a graph
 3    Data/Prob    Skill/Knowledge  4          HSB     Read two numbers from a graph and perform an operation with them
 4    Algebra      Und/Comp         4          HSB     Compare two algebraic expressions, given a relationship
 5    Arithmetic   Skill/Knowledge  4          HSB     Perform an arithmetic operation and compare result with a number
 6    Adv. Topics  Skill/Knowledge  4          HSB     Determine coordinates of points on a graph, perform an operation
 7    Algebra      Und/Comp         4          HSB     Compare two algebraic expressions
 8    Arithmetic   Skill/Knowledge  4          HSB     Perform an arithmetic operation, compare result with a number
 9    Arithmetic   Skill/Knowledge  4          HSB     Perform an arithmetic operation, compare result with a number
10    Arithmetic   Und/Comp         4          HSB     Compare statements about locations on two number lines
11    Geometry     Und/Comp         4          HSB     Compare length of line segments illustrated in a diagram
12    Arithmetic   Skill/Knowledge  4          HSB     Compare expressions involving mult. and division of integers
13    Arithmetic   Skill/Knowledge  4          HSB     Compare an integer with an expression using division of decimals
14    Algebra      Und/Comp         4          HSB     Compare expressions, given information containing exponents
15    Algebra      Skill/Knowledge  4          HSB     Compare expressions, requiring solution of simple equations
16    Arithmetic   Skill/Knowledge  4          HSB     Compare two quantities of money expressed differently
17    Arithmetic   Skill/Knowledge  4          HSB     Compare two simple arithmetic expressions involving division
18    Arithmetic   Skill/Knowledge  4          NELS    Compare two simple arithmetic expressions involving division
19    Arithmetic   Skill/Knowledge  4          NELS    Compare two simple arithmetic expressions involving multiplic.
20    Arithmetic   Und/Comp         4          NAEP    Set up a simple equation that is the solution of a word problem
21    Data/Prob    Und/Comp         5          NAEP    Estimate a probability that is the solution of a word problem
22    Arithmetic   Skill/Knowledge  4          NAEP    Determine the greatest of 4 decimal numbers
23    Arithmetic   Problem Solving  4          NAEP    Determine the smallest of 4 fractions in a word problem
24    Data/Prob    Und/Comp         4          NAEP    Choose verbal description of a prob. that doesn't match diagram
25    Geometry     Skill/Knowledge  5          NAEP    Determine the length of a line segment in a diagram
26    Algebra      Und/Comp         4          NAEP    Evaluate a relationship given statements about the variables
27    Algebra      Und/Comp         4          NAEP    Find an algebraic expression odd or even given fact about var.
28    Arithmetic   Problem Solving  4          NAEP    Solve a word problem requiring logical inference
APPENDIX E-2 (Continued)

Description of Mathematics Items

Item  Content      Process          # Options  Source  Item Description

29    Algebra      Und/Comp         5                  Solve a word problem whose answer is an algebraic expression
30    Arithmetic   Problem Solving  4          NAEP    Solve a word problem using multiplication or factoring
31    Arithmetic   Und/Comp         4          NAEP    Choose which decimal number is between two other numbers
32    Arithmetic   Und/Comp         4          NAEP    Choose points on a number line that include a specified decimal
33    Arithmetic   Und/Comp         5          NAEP    Estimate a number using a percentage indicated in a diagram
34    Algebra      Skill/Knowledge  4          NAEP    Solve a simple algebraic equation
35    Adv. Topics  Problem Solving  4          NAEP    Evaluate statements inferred from a word problem with a fraction
36    Arithmetic   Und/Comp         4          NAEP    Choose which expression is different from a specified percentage
37    Geometry     Und/Comp         4          NAEP    Solve a word problem requiring logical inference
38    Geometry     Und/Comp         4          NAEP    Evaluate statements referring to area and diagonal of a diagram
39    Algebra      Und/Comp         4          NAEP    Supply number that completes an algebraic equation correctly
40    Algebra      Skill/Knowledge  5          NAEP    Simplify an algebraic expression
APPENDIX E-3

Description of Science Items

Item  Content     Process          # Options  Source  Item Description

 1    Earth Sci   Problem Solving  4          NAEP    Infer geologic history from facts about limestone deposits
 2    Earth Sci   Decl Knowledge   5          NAEP    Identify components of solar system
 3    Chemistry   Und/Comp         4          NAEP    Read a graph depicting solubility of chemicals
 4    Sci Method  Problem Solving  4          NAEP    Choose an improvement for an experiment on mice
 5    Earth Sci   Decl Knowledge   5          HSB     Choose a statement about source of moon's light
 6    Life Sci    Decl Knowledge   5          HSB     Identify the example of a simple reflex
 7    Earth Sci   Und/Comp         4          NAEP    Choose viable way of communicating on the moon
 8    Earth Sci   Decl Knowledge   4          NAEP    Select statement about position of sun, moon, earth in diagram
 9    Life Sci    Decl Knowledge   5          NELS    Identify source of oxygen in ocean water
10    Chemistry   Decl Knowledge   4          NAEP    Choose the property used to classify a list of substances
11    Chemistry   Comprehension    4          NAEP    Explain lower freezing temperature of ocean water
12    Earth Sci   Decl Knowledge   5          HSB     Answer question about the earth's orbit
13    Life Sci    Problem Solving  4          NAEP    Infer use of oxygen from description of condition of aquarium
14    Chemistry   Problem Solving  5          HSB     Estimate temperature of a mixture
15    Life Sci    Decl Knowledge   4          NAEP    Select a statement about the process of respiration
16    Life Sci    Und/Comp         4          NAEP    Read a graph depicting digestion of a protein by an enzyme
17    Life Sci    Und/Comp         4          NAEP    Explain location of marine algae
18    Earth Sci   Decl Knowledge   4          NAEP    Choose best indication of an approaching storm
19    Chemistry   Decl Knowledge   4          NAEP    Choose the alternative that is NOT a chemical change
20    Chemistry   Problem Solving  4          NAEP    Infer statement from results of an experiment using a filter
21    Earth Sci   Und/Comp         4          NAEP    Explain reason for late afternoon breeze from the ocean
22    Life Sci    Problem Solving  4          NAEP    Select basis for a statement about a food chain
23    Chemistry   Problem Solving  4          NAEP    Interpret symbols describing a chemical reaction
24    Sci Method  Und/Comp         5          HSB     Differentiate statements based on a model or an observation
25    Life Sci    Problem Solving  5          HSB     Describe color of offspring from a guinea pig cross
APPENDIX E-4

Description of History/Citizenship/Geography Items

Item  Content      # Options  Source

 1    Geography    4          NAEP
 2    History      4          NAEP
 3    Citizenship  4          NAEP
 4    History      4          NAEP
 5    Citizenship  2          NAEP
 6    Citizenship  2          NAEP
 7    Citizenship  2          NAEP
 8    Citizenship  2          NAEP
 9    Citizenship  2          NAEP
10    History      4          NAEP
11    History      4          NAEP
12    Geography    4          NAEP
13    History      4          NELS
14    History      4          NAEP
15    Citizenship  5          NAEP
16    Citizenship  5          NAEP
17    History      4          NAEP
18    History      4          NELS
19    Citizenship  5          NAEP
20    History      4          NAEP
21    History      4          NAEP
22    Citizenship  4          NAEP
23    Citizenship  4          NAEP
24    Citizenship  4          HSB
25    History      4          NAEP
26    Geography    4          NAEP
27    History      4          NAEP
28    History      4          NAEP
29    History      4          NAEP
30    Citizenship  4          HSB

Item Description



[illegible] people have [illegible]
Identify a Civil War era institution
Identify the response that is NOT a constitutional right
Identify a historically important manufacturing technique
Decide whether an action is legal or not legal
Decide whether an action is legal or not legal
Decide whether an action is legal or not legal
Decide whether an action is legal or not legal
Complete a statement about immigration patterns [illegible] historical event
Choose the correct option concerning the U.S. Congress
Choose the correct option concerning the U.S. Congress
Identify the organization described [illegible] Congress
Identify [illegible] of an important historical document
Identify [illegible] important historical document
[illegible] feature of U.S. homes at a specified time period
Identify the location and time of an important historical [illegible]
Identify the principle exemplified by a specified requirement
APPENDIX F

Intercorrelations of Testlets

            READ-LIT  READ-SCI  READ-POE  READ-BIO  READ-HST     ARITH   ALGEBRA  GEOMETRY
READ-LIT        1.00      0.46      0.48      0.46      0.41      0.47      0.46      0.17
READ-SCI        0.46      1.00      0.48      0.46      0.40      0.54      0.51      0.20
READ-POE        0.48      0.48      1.00      0.53      0.47      0.54      0.53      0.21
READ-BIO        0.46      0.46      0.53      1.00      0.52      0.51      0.51      0.21
READ-HST        0.41      0.40      0.47      0.52      1.00      0.48      0.46      0.20
ARITH           0.47      0.54      0.54      0.51      0.48      1.00      0.80      0.32
ALGEBRA         0.46      0.51      0.53      0.51      0.46      0.80      1.00      0.32
GEOMETRY        0.17      0.20      0.21      0.21      0.20      0.32      0.32      1.00
PROBILTY        0.31      0.34      0.32      0.31      0.29      0.49      0.46      0.19
EARTHSCI        0.42      0.44      0.45      0.43      0.40      0.55      0.51      0.22
LIFE SCI        0.42      0.43      0.47      0.45      0.40      0.54      0.52      0.20
CHEMISTR        0.35      0.40      0.40      0.38      0.36      0.54      0.52      0.23
SCI METH        0.29      0.30      0.33      0.31      0.29      0.36      0.34      0.14
HISTORY         0.47      0.48      0.50      0.49      0.44      0.56      0.54      0.23
CIT/GOVT        0.47      0.47      0.50      0.50      0.45      0.58      0.56      0.23
GEOG/EC         0.42      0.43      0.45      0.45      0.42      0.53      0.51      0.22

            PROBILTY  EARTHSCI  LIFE SCI  CHEMISTR  SCI METH   HISTORY  CIT/GOVT   GEOG/EC
READ-LIT        0.31      0.42      0.42      0.35      0.29      0.47      0.47      0.42
READ-SCI        0.34      0.44      0.43      0.40      0.30      0.48      0.47      0.43
READ-POE        0.32      0.45      0.47      0.40      0.33      0.50      0.50      0.45
READ-BIO        0.31      0.43      0.45      0.38      0.31      0.49      0.50      0.45
READ-HST        0.29      0.40      0.40      0.36      0.29      0.44      0.45      0.42
ARITH           0.49      0.55      0.54      0.54      0.36      0.56      0.58      0.53
ALGEBRA         0.46      0.51      0.52      0.52      0.34      0.54      0.56      0.51
GEOMETRY        0.19      0.22      0.20      0.23      0.14      0.23      0.23      0.22
PROBILTY        1.00      0.35      0.33      0.34      0.22      0.35      0.37      0.33
EARTHSCI        0.35      1.00      0.50      0.47      0.33      0.54      0.51      0.49
LIFE SCI        0.33      0.50      1.00      0.43      0.33      0.49      0.49      0.46
CHEMISTR        0.34      0.47      0.43      1.00      0.29      0.45      0.44      0.43
SCI METH        0.22      0.33      0.33      0.29      1.00      0.34      0.34      0.32
HISTORY         0.35      0.54      0.49      0.45      0.34      1.00      0.64      0.55
CIT/GOVT        0.37      0.51      0.49      0.44      0.34      0.64      1.00      0.54
GEOG/EC         0.33      0.49      0.46      0.43      0.32      0.55      0.54      1.00

Source: U.S. Department of Education, National Center for Education Statistics,
National Education Longitudinal Study of 1988: Base Year Survey.
APPENDIX G



Definitions of Proficiency Scores 



Each proficiency score level was marked by four items, which were chosen as 
having similar difficulty and content. Success, or "passing" a level, was defined as 
answering at least three of the four items correctly. As described in the text of the 
report, two such levels were defined for Reading, and three for Mathematics. The 
sequence numbers of the items selected for determining the proficiency levels are listed 
below along with their content classifications and a brief description of the item itself. 



Reading 

Level 1: Simple reading comprehension including reproduction of detail and/or the
author's main thought

 1  Repro-Detail     Identify the objective of a character's action
 2  Repro-Detail     Identify character's assumption in planning action
 3  Repro-Detail     Identify the reason the character's plan didn't work
16  Repro-Detail     Define the meaning of a phrase

Level 2: Ability to make inferences beyond the author's main thought and/or
understand and evaluate relatively abstract concepts.

 5  Inference/Eval   Choose adage that best fits the lesson to be learned
10  Inference/Eval   Infer the meaning of a metaphor from context of line
13  Inference/Eval   Identify the author's state of mind
14  Inference/Eval   Identify an example of personification



Mathematics 

Level 1: Simple arithmetical operations on whole numbers

16  Proc/Decl        Compare two quantities of money expressed differently
17  Proc/Decl        Compare two simple arithmetic expressions involving
                     division of integers
19  Proc/Decl        Compare two simple arithmetic expressions involving
                     multiplication of integers
20  Proc/Decl        Set up a simple equation involving addition or subtraction
                     of integers that is the solution of a word problem

Level 2: Simple operations with decimals, fractions, and roots

 5  Proc/Decl        Perform an arithmetic operation (square root) and
                     compare result with a number
13  Proc/Decl        Compare an integer with an expression using division of
                     decimals
14  Proc/Decl        Compare expressions, given information containing
                     exponents
18  Proc/Decl        Compare two simple arithmetic expressions involving
                     division

Level 3: Simple problem solving, requiring conceptual understanding and/or the
development of a solution strategy

11  Problem Solving  Compare length of line segments illustrated in a diagram
36  Comprehension    Choose which expression is different from a specified
                     percentage
39  Comprehension    Supply number that completes an algebraic equation
                     correctly
40  Proc/Decl        Simplify an algebraic expression

Assigning students to one of three proficiency categories for Reading (below
Level 1, proficient at Level 1 but not Level 2, and proficient at Level 2) and four
analogous categories for Mathematics was a straightforward process for the majority of
test-takers. Even if a student had omitted one or more items in a 4-item cluster, a
pass/fail determination could be made as long as the remaining three items had been
answered correctly, or at least two were answered incorrectly.
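
The pass/fail determination for a single 4-item cluster can be sketched as follows. This is an illustration of the rule as stated above, not the project's actual scoring program; the function name `classify_level` and the encoding of responses (True = correct, False = incorrect, None = omitted) are assumptions.

```python
from typing import Optional, Sequence

def classify_level(responses: Sequence[Optional[bool]]) -> Optional[str]:
    """Classify one 4-item cluster as 'pass', 'fail', or None (undetermined).

    True = answered correctly, False = answered incorrectly, None = omitted.
    """
    if len(responses) != 4:
        raise ValueError("each proficiency level is marked by exactly four items")
    correct = sum(r is True for r in responses)
    incorrect = sum(r is False for r in responses)
    if correct >= 3:
        return "pass"   # three of four correct: any omitted item cannot change this
    if incorrect >= 2:
        return "fail"   # two incorrect: three correct is no longer reachable
    return None         # too many omits to decide either way

examples = {
    "one omit, three correct": classify_level([True, True, True, None]),
    "two omits, two incorrect": classify_level([False, False, None, None]),
    "two omits, two correct": classify_level([True, True, None, None]),
}
```

The third example is undetermined because the two omitted items could push the cluster either way.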

Problems in identifying a student's proficiency level could arise from one of two
conditions. First, a student might not answer enough items at one or more levels to
meet either the 3-correct (pass) or 2-incorrect (fail) criterion. This might be due
to lack of motivation to complete a "no risk" test, or to a reluctance to guess that seems to
characterize some students. As pointed out in the text section on speededness,
insufficient time to complete the test was unlikely to have been a factor. The second
possible problematic response pattern is a "reversal", that is, passing a more difficult
level after failing an easier one. Such a reversal pattern might be a result of a few
careless mistakes combined with a few lucky guesses, or, again, could be related to
motivation. In any case, it would be inconsistent with the hypothesized hierarchical
model.

Proficiency scores on the Reading test could be determined directly for 96% of 
the students who had taken the test. Only about 3% of the students answered too few 
items to be classified, and 1% had the only possible reversal pattern: fail Level 1, pass 
Level 2. Success in classifying students on the Reading test was probably due to several 
factors. The Reading test was the first test in the booklet, so unmotivated students may 
not yet have gotten tired of responding. Only two levels, eight items, were required, 
most of which fell in the first part of the test. And with only one reversal pattern 
possible, the potential for inconsistencies due to guessing was minimal. NCES staff 
members decided that the 4% rate of unclassified students did not warrant attempts at 
resolution. 
Assignment of Mathematics proficiency scores was a considerably more complex
process. Determinations based on the students' item responses alone resulted in only
86% of the students being classified. About 8.5% of the students had omitted too many
items to be categorized, and another 5.5% had reversals. Again, several factors were at
work. Three of the four Level 3 items fell at or near the end of the Mathematics
section, where they were least likely to be answered either by the few students who ran
out of time or by those not motivated to finish. Mathematics had more proficiency
levels, three, consisting of more items, twelve, than were required for classification in
Reading. And the potential for reversals was greater: with three levels, there are four
different ways a reversal could occur. The 14% missing data rate for mathematics
proficiency scores was unacceptably high. In particular, it appeared that population
estimates of mathematics proficiency might be biased upward if a substantial number of
the lowest-ability students, who were more likely to have omitted some of the Level 3
items, were not scored. Evidence for this view was provided by the IRT formula score
mean for students excluded for missing responses: it was nearly half a standard
deviation lower than that of the total sample.
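
The count of reversal patterns (one for the two Reading levels, four for the three Mathematics levels) can be verified by enumeration. The sketch below is illustrative only; the pattern notation (one letter per level, easiest first, P = pass, F = fail) follows the appendix, while the function names are assumptions.

```python
from itertools import product
from typing import List

def is_reversal(pattern: str) -> bool:
    """True if some level is passed after an easier level was failed."""
    first_fail = pattern.find("F")
    return first_fail != -1 and "P" in pattern[first_fail:]

def reversal_patterns(n_levels: int) -> List[str]:
    """All pass/fail patterns over n_levels that violate the hierarchy."""
    patterns = ("".join(p) for p in product("PF", repeat=n_levels))
    return [p for p in patterns if is_reversal(p)]

reading_reversals = reversal_patterns(2)   # two Reading levels
math_reversals = reversal_patterns(3)      # three Mathematics levels
```

The remaining three-level patterns (FFF, PFF, PPF, PPP) are the ones consistent with the hierarchical model.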

A classification scheme was devised by a consensus of NCES staff and project 
staff that provided estimates of proficiency levels for about half of the missing 
Mathematics students. 

First of all, it was decided not to attempt resolution of the 5.5% of students who
demonstrated reversal patterns. These students did have enough items answered to be
scored, but their classifications, for whatever reason, did not fit the hierarchical model.
Moreover, since their IRT formula score mean was almost identical to that of the total
sample, it appeared that omitting proficiency scores for these students would not
introduce any systematic bias into the national estimates.

The procedure for obtaining proficiency scores for students who had omitted
critical items required a method of estimating what those item responses would have
been had they been there. The Item Response Theory (IRT) parameters described in
the text of the report provided a means of obtaining estimates of item responses for
each individual student. The formula presented in that section specifies the probability
that a student at a particular ability level, theta, will answer a specific item correctly,
given the three parameters of that item: a (discrimination index), b (difficulty level), and
c (the guessing parameter).
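
The formula itself appears in the body of the report rather than in this appendix. As a hedged sketch, the standard three-parameter logistic form with parameters a, b, and c is shown below; the scaling constant D = 1.7 is an assumption (a common convention), and the function name `p_correct` is illustrative.

```python
import math

def p_correct(theta: float, a: float, b: float, c: float, D: float = 1.7) -> float:
    """Three-parameter logistic probability of a correct answer.

    theta: ability; a: discrimination; b: difficulty; c: guessing parameter.
    D is a scaling constant (assumed here to be 1.7).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))

# When theta equals the item difficulty b, the probability is halfway
# between the guessing parameter c and 1.
p_at_difficulty = p_correct(theta=0.0, a=1.0, b=0.0, c=0.2)

# Very low ability approaches the guessing parameter; very high approaches 1.
p_low = p_correct(theta=-6.0, a=1.0, b=0.0, c=0.2)
p_high = p_correct(theta=6.0, a=1.0, b=0.0, c=0.2)
```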

A "simulated" right/wrong response to the item can then be obtained by,
essentially, flipping a biased coin, with the amount of bias in the coin toss equal to the
probability of a correct answer. Translated into operational terms, this means obtaining
a computer-generated random number between 0 and 1, and comparing it with the
probability of a correct answer provided by the formula. If the random number is less
than or equal to the probability, the simulated response is "correct"; otherwise it is
"incorrect." For example, if a particular student has a probability of getting a particular
item correct equal to .75, then any random number up to and including .75 will produce
an estimated correct response; a random number greater than .75 will be classified as
incorrect.
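
The biased coin flip can be sketched in a few lines. This is an illustration under assumptions, not the program used for the study: the function names are invented, and Python's standard `random` module stands in for whatever random number generator was actually used.

```python
import random

def response_from_draw(p_correct: float, draw: float) -> bool:
    """Simulated response is correct when the random draw is <= p_correct."""
    return draw <= p_correct

def simulate_response(p_correct: float, rng: random.Random) -> bool:
    """Draw a uniform random number in [0, 1) and compare it with p_correct."""
    return response_from_draw(p_correct, rng.random())

# The example from the text: with probability .75, a draw of .75 counts as
# correct and a draw of .76 counts as incorrect.
at_threshold = response_from_draw(0.75, 0.75)
above_threshold = response_from_draw(0.75, 0.76)

# Over many draws, the share of simulated correct responses approaches .75.
rng = random.Random(1988)  # fixed seed so the sketch is repeatable
share_correct = sum(simulate_response(0.75, rng) for _ in range(10_000)) / 10_000
```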
Given a procedure for simulating answers to omitted items, NCES staff members
specified a set of decision rules for resolutions that took into account the number and
location of the missing items. Response patterns were grouped and treated as described
below.

1) All students who omitted items at Level 1 but passed Levels 2 and 3 (designated
_PP) were judged to have passed all three levels without resorting to simulated
scores for the missing items. It was reasoned that if at least three out of four of
the more difficult items were answered correctly at both of the advanced levels,
the student almost certainly was proficient at the lowest level as well. Similarly,
students who failed the first two levels and omitted Level 3 items (FF_) were
assigned a failing score at the highest level. If these students answered sufficient
items at the two lower levels, and answered them incorrectly, it was highly
unlikely that they possessed the skills to solve three out of four items in the most
difficult cluster.

2) The next three patterns treated consisted of students who had answered sufficient
items to be classified at two of the three levels, and omitted items only at one
level. In addition, the location of the missing level, and the right/wrong
designation of the remaining two, was such that the missing level could be
resolved either way, pass or fail, and still produce a consistent (hierarchical)
result. These three patterns were:

PP_ (Pass Levels 1 and 2, omit items at Level 3) 
P_F (Pass Level 1, omit items at Level 2, fail Level 3) 
_FF (Omit items at Level 1, fail Levels 2 and 3) 

As can be seen, either a P or an F inserted in the blank spaces would produce an
acceptable solution. For all students with these three response patterns, item
responses were simulated for all omitted items in the blank level, regardless of
how many of the four items were blank. Then the simulated correct responses
were counted along with the actual correct responses, and a pass/fail score for
the missing level was assigned based on the three out of four requirement.

3) The remaining students had response patterns with a missing designation at
more than one level and/or a pattern that indicated a potential for a reversal.
Given the ambiguity, it was decided to implement the simulation procedure for a
given level only if two or more items had been responded to at that level. If this
relatively conservative treatment yielded either a consistent (hierarchical) pattern,
or the _PP or FF_ patterns described in (1) above, proficiency scores were
assigned accordingly. If the constraint on the number of items simulated still left
a blank level other than the two specified, or if the resolution produced a reversal
pattern, proficiency scores were omitted for the student.
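
The three decision rules can be put together in a short sketch. It is one reading of the rules as described above, not the program actually used; all names are hypothetical, item responses are encoded as True (correct), False (incorrect), or None (omitted), and `probs` holds an assumed probability of a correct answer for each item and student.

```python
import random
from typing import List, Optional

Level = List[Optional[bool]]                # four items per level
HIERARCHICAL = {"FFF", "PFF", "PPF", "PPP"}
SIMPLE_GAPS = {"PP_", "P_F", "_FF"}         # rule 2: either outcome is consistent

def score_level(items: Level) -> str:
    """'P' for 3+ correct, 'F' for 2+ incorrect, '_' if undetermined."""
    correct = sum(r is True for r in items)
    incorrect = sum(r is False for r in items)
    if correct >= 3:
        return "P"
    if incorrect >= 2:
        return "F"
    return "_"

def fill_omits(items: Level, probs: List[float], rng: random.Random) -> Level:
    """Replace each omitted item by a simulated response (biased coin flip)."""
    return [(rng.random() <= p) if r is None else r for r, p in zip(items, probs)]

def resolve(levels: List[Level], probs: List[List[float]],
            rng: random.Random) -> Optional[str]:
    """Return a hierarchical pattern such as 'PPF', or None if left unscored."""
    pattern = "".join(score_level(items) for items in levels)
    if pattern == "_PP":                    # rule 1, no simulation needed
        return "PPP"
    if pattern == "FF_":
        return "FFF"
    marks = []
    for items, item_probs, mark in zip(levels, probs, pattern):
        if mark == "_":
            answered = sum(r is not None for r in items)
            # rule 2 simulates regardless of omits; rule 3 needs 2+ answered
            if pattern in SIMPLE_GAPS or answered >= 2:
                mark = score_level(fill_omits(items, item_probs, rng))
        marks.append(mark)
    result = "".join(marks)
    if result == "_PP":
        return "PPP"
    if result == "FF_":
        return "FFF"
    return result if result in HIERARCHICAL else None

P4, F4, O4 = [True] * 4, [False] * 4, [None] * 4
always = [[1.0] * 4 for _ in range(3)]      # simulated items always correct
never = [[0.0] * 4 for _ in range(3)]       # simulated items (almost surely) wrong
rng = random.Random(0)
```

For instance, `resolve([P4, P4, O4], always, rng)` falls under rule 2 and yields `"PPP"`, while a reversal such as `[F4, P4, F4]` is left unscored.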

The resolution process brought the proportion of students with missing
proficiency scores down from 14% to 7.3%. Moreover, it brought the discrepancy
in formula score mean for the unscored cases down from half a standard
deviation to about a tenth of a standard deviation. This is a good indication that
the bias in estimates due to missing data has been considerably reduced.
Appendix H

Standard Errors of Measurement at Theta Scale Points

  Theta    Reading      Math   Science       HCG

-3.0000     1.7458    1.4380    1.6365    1.5644
-2.9000     1.6657    1.3598    1.5185    1.3409
-2.8000     1.5881    1.2871    1.4098    1.1543
-2.7000     1.5132    1.2192    1.3102    1.0003
-2.6000     1.4419    1.1555    1.2189    0.8743
-2.5000     1.3741    1.0956    1.1351    0.7719
-2.4000     1.3098    1.0389    1.0584    0.6895
-2.3000     1.2483    0.9849    0.9883    0.6236
-2.2000     1.1892    0.9331    0.9242    0.5617
-2.1000     1.1313    0.8832    0.8660    0.5314
-2.0000     1.0740    0.8349    0.8132    0.5008
-1.9000     1.0162    0.7880    0.7656    0.4780
-1.8000     0.9575    0.7424    0.7229    0.4617
-1.7000     0.8978    0.6981    0.6850    0.4503
-1.6000     0.8376    0.6552    0.6517    0.4427
-1.5000     0.7778    0.6138    0.6228    0.4377
-1.4000     0.7199    0.5742    0.5980    0.4345
-1.3000     0.6651    0.5365    0.5772    0.4323
-1.2000     0.6147    0.5008    0.5600    0.4304
-1.1000     0.5693    0.4672    0.5460    0.4282
-1.0000     0.5293    0.4358    0.5347    0.4253
-0.9000     0.4946    0.4066    0.5254    0.4215
-0.8000     0.4648    0.3795    0.5171    0.4167
-0.7000     0.4393    0.3547    0.5089    0.4112
-0.6000     0.4175    0.3321    0.4996    0.4050
-0.5000     0.3986    0.3119    0.4884    0.3978
-0.4000     0.3821    0.2939    0.4750    0.3894
-0.3000     0.3674    0.2783    0.4596    0.3792
-0.2000     0.3542    0.2647    0.4429    0.3674
-0.1000     0.3424    0.2530    0.4262    0.3543
 0.0000     0.3322    0.2429    0.4105    0.3411
 0.1000     0.3241    0.2344    0.3967    0.3291
 0.2000     0.3183    0.2273    0.3852    0.3192
 0.3000     0.3154    0.2218    0.3759    0.3119
 0.4000     0.3157    0.2181    0.3686    0.3071
 0.5000     0.3195    0.2163    0.3628    0.3043
 0.6000     0.3270    0.2167    0.3583    0.3032
 0.7000     0.3381    0.2194    0.3549    0.3035
 0.8000     0.3531    0.2247    0.3526    0.3052
 0.9000     0.3719    0.2323    0.3517    0.3083
 1.0000     0.3948    0.2425    0.3524    0.3128
Appendix H (cont'd)

Standard Errors of Measurement at Theta Scale Points

(Continued)

  Theta    Reading      Math   Science       HCG

 1.1000     0.4217    0.2552    0.3551    0.3181
 1.2000     0.4528    0.2704    0.3602    0.3240
 1.3000     0.4883    0.2883    0.3680    0.3302
 1.4000     0.5281    0.3089    0.3788    0.3376
 1.5000     0.5725    0.3321    0.3928    0.3475
 1.6000     0.6216    0.3581    0.4099    0.3619
 1.7000     0.6755    0.3869    0.4102    0.3826
 1.8000     0.7343    0.4184    0.4535    0.4107
 1.9000     0.7983    0.4528    0.4797    0.4470
 2.0000     0.8675    0.4902    0.5084    0.4919
 2.1000     0.9420    0.5307    0.5397    0.5454
 2.2000     1.0220    0.5745    0.5733    0.6075
 2.3000     1.1076    0.6217    0.6094    0.6780
 2.4000     1.1987    0.6725    0.6480    0.7569
 2.5000     1.2954    0.7272    0.6891    0.8442
 2.6000     1.3978    0.7860    0.7328    0.9400
 2.7000     1.5055    0.8490    0.7793    1.0445
 2.8000     1.6188    0.9165    0.8289    1.1581
 2.9000     1.7371    0.9886    0.8814    1.2811
 3.0000     1.8605    1.0656    0.9373    1.4139